1.  Summary

The  project  was  started  in  September  1996.  Its  goal  was  to  develop  techniques 
for  continual  reallocation  of  resources  to  maintain  application  performance 
despite  statically  unpredictable  change  in  resource  demands.  It  was  targeted  to 
multiple  application  systems  executing  on  HPC  (High  Performance  Computing) 
platforms.  It  was  anticipated  that  such  adaptive  capability  would  be  needed  in 
military  systems  such  as  SC-21. 

As  planned  for  this  project,  we  built  on  the  results  of  a  previous  program,  called 
Adaptive  Resource  Allocation  (ARA).  In  ARA,  we  developed  techniques  for 
dynamic  reallocation  of  resources  to  single  parallel  applications,  structured  as 
multi-pipelines,  executing  on  a  high-performance  parallel  machine.  We  extended 
ARA  results  to  systems  with  multiple  applications  and  multiple  machines 
connected  over  a  network. 

In  October  1997,  with  DARPA  approval,  we  decided  to  merge  the  technical  effort 
on  this  project  with  the  RTARM  project  funded  under  Quorum.  This  did  not  affect 
the  core  statement  of  work  for  ARM,  but  led  to  a  6-month  extension  of  its 
completion  date  from  November  1998  to  May  1999.  ARM  still  focused  on 
developing  an  approach  based  on  adaptation  models,  and  addressed  best-effort 
resource  allocation  in  an  environment  with  partitionable  rather  than  shared 
resources.  However,  parallel  HPC  platforms  were  de-emphasized  in  favor  of 
general  distributed  computing  platforms.  Some  of  the  work  we  had  completed,  in 
particular  the  software  infrastructure  for  managing  multiple  MPI-based 
applications,  became  less  relevant. 

Results  from  ARM  are  being  integrated  into  RTARM.  The  layered  architecture  of 
ARM  has  given  way  to  a  hierarchical  architecture  characterized  by  uniformity 
across  different  levels.  The  MPI-based  communication  infrastructure  in  ARM  has 
given  way  to  a  CORBA  ORB  infrastructure.  While  ARM  implementation  was 
targeted  to  Unix  machines  connected  over  Ethernet,  the  target  platform  for 
RTARM  consists  of  Windows  NT  machines  networked  over  ATM. 

The  work  was  performed  jointly  by  Honeywell  Technology  Center  and  Georgia 
Institute  of  Technology,  under  Honeywell  direction.  This  report  describes  only  the 
work  performed  under  ARM;  hence,  it  represents  an  intermediate  snapshot  of  the 
larger  merged  research. 

2.  Report  layout 

The  report  contains  the  following  sections.  A  brief  description  of  each  section  is 
given  below  to  establish  context  before  details  are  presented.  The  list  of  sections 
follows  the  list  of  tasks  in  the  statement  of  work. 

•  Program  Objective  -  This  section  describes  the  general  characteristics  of 
the  targeted  applications  and  the  overall  problem  that  ARM  addresses. 


•  Adaptation Models - This section describes those attributes of
applications and the underlying resources that are needed by ARM. Five
distinct models are described -

1.  Application  Execution  Models  capture  the  manner  in  which  applications 
consume  resources. 

2.  Performance  and  Timing  Models  capture  the  performance 
requirements  of  applications  in  a  system. 

3.  Decision  Models  contain  information  about  run-time  detection  of 
significant  transitions  in  performance. 

4.  Resource Allocation Models determine how to allocate and reallocate
resources across applications and within applications.

5.  Enactment  Models  describe  when  and  how  a  new  allocation  should  be 
brought  into  effect,  given  the  potential  cost  and  perturbation  of 
reallocation. 

The  main  motivation  for  separating  information  into  these  models  was  to 
support  a  flexible  architecture  for  ARM  with  plug  and  play  capability. 

•  ARM  Architecture:  This  section  describes  the  layered  adaptation 
architecture  we  developed.  Each  layer  engages  in  negotiation,  service 
translation,  real-time  monitoring  and  adaptation. 

•  Real-Time Instrumentation: We present an overview of the existing real-time
instrumentation system, SPI, and describe the changes we made to it
for ARM.

•  ARM  Run-Time  System:  This  section  describes  the  main  components  of 
the  run-time  support  for  adaptive  resource  management,  including  the 
software  infrastructure. 

•  Demonstrations:  This  section  describes  the  applications  we  demonstrated 
to  show  proof  of  ARM  concepts. 

Finally,  we  have  attached  a  set  of  papers  that  represent  the  work  performed 
under  this  project  or  built  upon  it. 

3.  Program  Objective 

Future  defense  systems  will  likely  be  characterized  by  dynamic  variability  in  the 
performance  demands  of  their  applications.  Many  embedded  DoD  applications 
will  be  reactive,  as  they  must  interact  with  changes  in  an  external  physical 
environment.  Often  their  run-time  behavior  will  also  be  heavily  data-dependent, 
depending  on  scene  parameters,  sensor  modality,  range  to  target,  etc. 
Consequently,  their  computing  resource  requirements  will  tend  to  vary 
considerably  during  execution,  and  for  the  most  part  be  statically  unpredictable. 

We  refer  to  such  systems  as  deployable  systems.  Given  their  time-varying  and 
irregular  resource  needs,  it  will  be  necessary  to  manage  resources  dynamically. 
Without dynamic adaptation in resource allocation, either computing platforms for
deployable  systems  will  have  to  be  oversized  or  they  will  fail  to  meet  the 
application  requirements.  In  addition,  in  future  military  systems,  the  demand  for 
higher  agility  will  further  require  applications  to  be  adaptive. 

Effective management of computing resources in such an environment, together with
the adaptation of individual application subsystems, is a challenging task. Deployable
systems are different from the computing systems used in ground-based
command and control operations over geographically dispersed areas. In
deployable  systems,  applications  are  often  interdependent;  the  performance 
requirements  are  usually  stringent,  and  the  applications  tend  to  be  more  dynamic 
because  they  are  embedded  in  a  potentially  rapidly  changing  environment. 

The  objective  of  ARM  was  to  provide  adaptive  resource  management 
mechanisms  for  specific  models  of  applications,  computing  environments  and 
resource  usage.  Adaptation  is  viewed  in  terms  of  continual  allocation  and 
reallocation  of  resources  among  the  applications  constituting  a  system  to  meet 
system-wide  objectives. 


4.  Adaptation  Models 


4.1  Application  Models 

An  application  model  determines  how  resources  are  requested  and  consumed. 
Some applications may be distributed across multiple computers. For example,
the  front  end  of  a  sensor-based  application  may  be  implemented  on  a  SIMD 
machine,  whereas  the  back  end  object  processing  is  often  implemented  on 
general  purpose  MIMD  machines.  We  assume  that  the  data-parallel  components 
of  applications  are  implemented  on  MIMD  computers  as  SPMD  programs,  which 
is  a  common  style  for  hand-written  codes  as  well  as  codes  produced  by 
compilers  for  parallel  languages  such  as  HPF. 

Multiple  applications  may  run  simultaneously  on  a  computer.  The  nodes  of  a 
computer  may  be  partitioned  across  applications  using  either  space  multiplexing 
or time multiplexing. In our research, we limited ourselves to space sharing as it
is much more common in commercial HPC computers. Time-sharing is beset with
severe performance penalties due to context switching and the difficulty of
co-scheduling an application's tasks on multiple nodes on multiple computers.

Workload  Model 

The  workload  is  a  simplified  multi-pipeline  where  individual  stages  may  be 
tagged  as  parallel  programs.  A  pipeline  is  an  acyclically  connected  set  of  stages. 
A  stage  has  zero  or  more  inputs  and  zero  or  more  outputs.  Stages  are  connected 
by connecting an output of one stage to an input of another. No inputs or outputs
are unconnected. Signal sources are modeled as stages with no input, signal
sinks as stages with no output. Currently, we assume that there is only one
source and one sink.

Figure 1: Multi-Pipeline

The  intent  is  to  use  a  recursive  definition,  so  that  even  a  connected  subset  of 
stages  or  just  one  stage,  may  be  viewed  as  a  pipeline.  Invocation  of  a  stage  is 
input-driven.  Output  is  always  invocation-driven. 

The more complicated the model, the more complicated the service request
translation (SRT). We decided to stay with simpler models because SRT was not
the central focus of our research.


Figure  2:  A  pipeline  is  a  recursive  structure  of  stages 

•  The  workload  consists  of  a  set  of  multi-pipeline  applications,  each  with 
end-to-end  QoS  requirements. 

•  A  multi-pipeline  is  a  DAG  of  stages,  each  stage  with  zero  or  more  in  arcs 
(inputs),  computational  load,  and  zero  or  more  out  arcs  (outputs). 

•  Stage invocation may be periodic (allowed only for source stages) or
input-driven. An input-driven stage is invoked when a specified number of
arrivals occur on each of its inputs.

•  Computational  load  in  a  stage  varies,  depending  on  data  sizes  at  input, 
and  data-dependence. 

•  A  stage  issues  output  once  after  every  invocation.  The  output  data  size 
may  vary  depending  upon  input  data  sizes. 

•  For parallel stages, the model includes a description of their parallelism.
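The structural rules listed above can be illustrated with a minimal Python sketch. It is not taken from the ARM implementation: the `Stage` class, the `validate_pipeline` function, and the stage names are hypothetical, and the check covers only connectivity (acyclic graph, one source, one sink), not load or QoS attributes.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Stage:
    """One stage of a multi-pipeline (illustrative names, not ARM's own)."""
    name: str
    parallel: bool = False                           # tagged as a parallel program
    inputs: list[str] = field(default_factory=list)  # names of upstream stages


def validate_pipeline(stages: list[Stage]) -> list[str]:
    """Check the structural rules and return stage names in a valid order."""
    by_name = {s.name: s for s in stages}
    for s in stages:
        for up in s.inputs:
            if up not in by_name:
                raise ValueError(f"{s.name}: input from unknown stage {up}")
    # A pipeline is acyclic: static_order() raises CycleError (a ValueError)
    # if the stage graph contains a cycle.
    graph = {s.name: s.inputs for s in stages}
    order = list(TopologicalSorter(graph).static_order())
    # Sources have no inputs; sinks feed no other stage.
    sources = [s.name for s in stages if not s.inputs]
    consumed = {up for s in stages for up in s.inputs}
    sinks = [s.name for s in stages if s.name not in consumed]
    if len(sources) != 1 or len(sinks) != 1:
        raise ValueError("expected exactly one source and one sink")
    return order
```

For a four-stage example in which S1 feeds two parallel stages S2 and S3, which both feed S4, the function returns an order that starts at S1 and ends at S4. A graph with two sinks or with a cycle is rejected.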

Parallel  programs  are  often  described  as  task  graphs  (TG)  consisting  of  tasks 
linked  by  communication  edges.  Task  graphs  have  no  temporal  information 
about  when  the  communications  take  place.  We  decided  to  use  the  Temporal 
Communication  Graph  (TCG)  concept  from  Origami,  which  provides  an  unrolling 
of  static  task  graphs  in  time.  TCG  and  TG  are  static,  in  that  the  number  of  tasks 
and  edges  between  them  do  not  change  at  run-time. 

Our target description expresses parallel computations as a function of the
number of processors on which they are mapped. Our objective in a specification
mechanism  for  temporal  information  was  that  we  could  provide  a  communication 
traffic  description  to  NetEx  (i.e.  source,  destination,  and  traffic  pattern)  when  we 
transitioned  from  ARM  to  RTARM. 

Figure  3:  Workload  Model:  Pipeline  with  Parallel  Stages 

We  use  an  enhanced  version  of  TCG  that  makes  it  into  a  template  for 
instantiation  based  on  the  number  of  processors  allocated  to  it. 

4.2  Performance  and  Timing  Models 

QoS is multi-dimensional. Each dimension is viewed as a range of values, bounded
by a low and a high value, with a set of thresholds that define points at which
some specific action is to be taken. For example, a drop in QoS below a threshold
might trigger adaptation.

QoS  includes  quality  dimensions  and  service  dimensions.  The  quality  dimensions 
include  - 

•  Throughput  as  a  function  of  input  rate,  reckoned  at  output. 

•  End-to-end  latency  between  source  and  sink. 

Service  dimensions  include  per-stage  specification  of  - 

•  Computational  load  as  a  function  of  input  data  sizes,  and  specification  for 
each  output  data  size  as  a  function  of  input  data  sizes. 

•  Invocation  rate 
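As an illustration of a QoS dimension defined by a range and a set of action thresholds, the following Python sketch may help. The `QoSDimension` class, its method names, and the action labels are assumptions made for this example. The report does not specify how thresholds are represented.

```python
from dataclasses import dataclass


@dataclass
class QoSDimension:
    """One QoS dimension: an acceptable range plus action thresholds."""
    name: str
    low: float
    high: float
    thresholds: dict[float, str]   # threshold value -> action label (hypothetical)

    def in_range(self, value: float) -> bool:
        """True if the value lies inside the [low, high] range."""
        return self.low <= value <= self.high

    def actions_for_drop(self, previous: float, current: float) -> list[str]:
        """Actions whose threshold was crossed downward between two samples."""
        crossed = []
        # Visit thresholds from highest to lowest so actions come out in the
        # order in which a falling value would cross them.
        for level, action in sorted(self.thresholds.items(), reverse=True):
            if current < level <= previous:
                crossed.append(action)
        return crossed
```

With thresholds at 20 ("warn") and 15 ("adapt"), a throughput drop from 25 to 18 crosses only the first threshold, while a drop from 25 to 14 crosses both.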


4.3  Decision  Models 

A  critical  component  of  the  reallocation  process  is  the  decision  model  that 
determines  when  a  reallocation  of  resources  is  necessary.  As  described  earlier, 
applications  are  modeled  as  an  acyclic  graph  of  data-parallel  tasks.  Data  frames 
are  pipelined  through  this  graph  and  each  of  these  data-parallel  tasks  can  be 
further  structured  as  a  collection  of  subtasks,  each  running  on  an  individual 
processor.  The  number  of  subtasks  within  a  task  varies  as  processors  are 
dynamically  allocated  to  and  deallocated  from  the  original  task.  The  subtasks  are 
instrumented  to  provide  performance  measurements  in  real-time.  Detectors 
process  these  instrumented  streams  of  data  to  produce  detection  events  each 
signaling a major change in performance metrics. Decision models process these
streams  of  detection  events  to  determine  if  resource  reallocation  is  necessary, 
and  if  so,  to  initiate  procedures  for  the  computation  and  enactment  of  new 
reallocations. In ARM, we address the reallocation of processors among tasks to
maintain minimal frame latency through the task graph.

The majority of existing research on resource allocation and reallocation is
focused  on  algorithms  that  determine  how  to  most  effectively  allocate  or 
reallocate  resources.  There  is  an  extensive  literature  on  dynamic  resource 
allocation,  typically  in  the  context  of  load  balancing  algorithms.  Strategies 
typically focus on where tasks must be scheduled as a function of available
resources. More recent research has studied dynamic processor
scheduling  algorithms  in  multiprocessor  systems  and  even  algorithms  for 
dynamic  control  of  communication  resources  in  parallel/distributed  applications. 

These  resource  allocation  algorithms  rely  on  the  existence  of  a  mechanism  that 
determines  when  they  are  invoked,  for  example,  at  task  arrival  time.  This  does 
not  permit  reaction  to  run-time  load  variations  within  the  application.  We  decided 
that  for  run-time  reallocation,  it  is  critical  to  be  able  to  determine  when  such 
resource  reallocation  algorithms  should  be  invoked  during  task  execution. 
Accurate  timing  can  avoid  thrashing  during  transient  workload  changes,  permit 
low  latency  reallocation,  and  in  some  instances  preempt  performance 
degradation  by  predicting  reallocation  needs. 
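The effect of trigger timing on thrashing can be illustrated with a deliberately simple Python sketch. It is a stand-in and not one of the decision models developed in this project: it fires only after several consecutive latency samples exceed a limit, so an isolated spike does not cause a reallocation. The class name and the `limit` and `patience` parameters are hypothetical.

```python
class ConsecutiveViolationTrigger:
    """Fire only after `patience` consecutive samples exceed `limit`."""

    def __init__(self, limit: float, patience: int) -> None:
        if patience < 1:
            raise ValueError("patience must be at least 1")
        self.limit = limit
        self.patience = patience
        self._run = 0   # length of the current run of violations

    def observe(self, latency: float) -> bool:
        """Feed one sample; return True when reallocation should be invoked."""
        if latency > self.limit:
            self._run += 1
        else:
            self._run = 0          # a good sample ends the run (spike filtered)
        if self._run >= self.patience:
            self._run = 0          # reset so a sustained shift fires once per run
            return True
        return False
```

A larger `patience` filters more transients at the cost of slower reaction, which is the trade-off between thrashing and reallocation latency described above.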

Georgia  Tech  developed  a  combination  of  a  low-latency  decision  model  that  is 
reactive  in  nature  with  a  relatively  more  complex  decision  model  that  is  predictive 
in  nature.  The  model  is  quite  insensitive  to  transient  workload  shifts  or  "spikes", 
thereby  reducing  ineffective  reallocations.  The  model  is  also  quite  effective  in 
predicting  impending  workload  changes.  Thus,  the  decision  model  can  be 
"tuned"  based  on  some  knowledge  of  the  application  behavior.  Using  a  synthetic 
benchmark  generator,  we  experimentally  demonstrated  an  increase  in 
performance  and  a  decrease  in  overhead  across  a  range  of  input  data 
parameters.  While  the  current  implementations  are  focused  on  a  class  of 
computationally  intensive  sensor-processing  applications,  these  decision  models 
are  more  generally  applicable  to  asynchronous,  event-driven  computational 
models. 

By  coupling  the  reactive  Bayesian  model  with  the  predictive  Markovian  model, 
we  create  a  multi-level  decision  model  capable  of  improving  the  performance  of 
adaptive  resource  managers  under  a  variety  of  input  conditions.  Under  average 
input  conditions,  both  models  contribute  to  decrease  the  end-to-end  latency  of 
input  frames  and  reduce  the  decision  and  enactment  overhead.  Toward  the 
extremes,  the  Bayesian  model  proves  more  applicable  to  high  noise 
environments  and  the  Markovian  model  better  suited  to  low  noise  environments. 
In  these  situations,  the  less  suited  model  provides  good  backup  support  for  the 
more  effective  model. 

Under low noise conditions, the Bayesian level keeps track with the baseline
model while the Markovian level pushes the system toward more acceptable
performance states. Under high noise conditions, the Bayesian level filters a
much larger percentage of the input spikes while the Markovian level ensures
performance does not fall below the real-time specifications. Over a wide range of
input  streams,  the  coupled  model  is  shown  to  maintain  or  improve  the  latency 
performance  while  decreasing  the  number  of  false  triggers  and  unnecessary 
resource reallocations.

Ideas for future work include methods for dynamically varying the Bayesian and
Markovian  thresholds  in  response  to  the  current  task-level  resource  allocation, 
and  implementing  mechanisms  for  the  Markovian  model  to  suggest  appropriate 
resource  allocations  for  the  predicted  steady-state  behaviors. 


4.4  Resource  Allocation  Models 

It  is  desirable  that  the  underlying  machines  appear  to  the  applications  as  one 
virtual  machine  that  can  be  customized  according  to  their  individual  and 
collective  needs.  This  customizing  should  take  place  under  control  of  the 
applications  as  well  as  automatically  when  a  significant  change  in  resource 
demands  or  availability  is  detected  by  the  resource  management  system 
cognizant of application characteristics.


4.4.1.  Allocation  and  Assignment 

Mapping  an  application  to  a  heterogeneous  target  platform  is  a  two-part  problem: 

•  Allocation, which concerns the partitions of individual machines that are
allocated to individual applications.

•  Assignment, which concerns the mapping of software components to
specific processing resources, and may involve consideration of the
interprocessor communication behavior of the applications.

From here on, we use the term allocation (or mapping, configuration) to include
both allocation and assignment.

The  ARM  system  should  provide  continual  on-line  reallocation  of  resources  to 
meet  the  overall  mission  objectives.  The  following  types  of  events  may  trigger 
resource  reallocation  - 

•  Arrival  and  departure  of  applications 

•  Request by applications, e.g., when an application knows it is about to enter
a significantly different phase of computation

•  Potential performance shortfalls detected by ARM

•  Request by the user, e.g., on a mode change

ARM  can  be  effective  only  if  the  overhead  of  reallocation  is  significantly  lower 
than  the  cost  of  doing  no  reallocation.  Sufficiently  fast  algorithms  are  needed  to 
compute a new allocation. As a result of resource reallocation, application
components may migrate across heterogeneous computers, with possibly
significant change in the application's performance.


4.4.2.  Resource  Models 

We adopted a hierarchical resource model, with a flat allocation model. For every
resource in the system, a certain amount of the resource is free, tested (for
reservation) or reserved. For illustration, if a resource manager manages each
resource, one can represent a parallel machine like the Cray T3D as a machine
where the unit of allocation is the node.

Figure 4: Illustration of Resource Model

Similarly,  one  can  represent  an  IBM  SP2  with  several  nodes,  where  the  unit  of 
allocation  is  a  percentage  of  processor  allocation. 
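The free, tested, and reserved amounts can be sketched in Python as follows. This is an illustration under a stated assumption: "tested" is read here as a tentative hold placed while a reservation is being negotiated, which is later either committed or cancelled. The `ManagedResource` class and its method names are hypothetical.

```python
class ManagedResource:
    """Tracks how much of one resource is free, tested, or reserved."""

    def __init__(self, name: str, capacity: float) -> None:
        self.name = name
        self.capacity = capacity   # total units (nodes, percent of a CPU, ...)
        self.tested = 0.0          # tentatively held, not yet committed
        self.reserved = 0.0        # firmly reserved

    @property
    def free(self) -> float:
        return self.capacity - self.tested - self.reserved

    def test(self, amount: float) -> bool:
        """Tentatively hold `amount`; fail if not enough is free."""
        if amount <= 0 or amount > self.free:
            return False
        self.tested += amount
        return True

    def commit(self, amount: float) -> None:
        """Turn a tested amount into a firm reservation."""
        if amount > self.tested:
            raise ValueError("cannot commit more than was tested")
        self.tested -= amount
        self.reserved += amount

    def cancel(self, amount: float) -> None:
        """Return a tested amount to the free pool."""
        if amount > self.tested:
            raise ValueError("cannot cancel more than was tested")
        self.tested -= amount
```

The same class covers both examples in the text: a machine allocated by node would use a capacity equal to its node count, and a node allocated by processor share would use a capacity of 100.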

5.  ARM  Architecture 

ARM  is  divided  into  multiple  layers,  including  an  application  layer  and  one  or 
more  resource  layers.  The  application  layer  (A-Layer)  is  concerned  with  resource 
management  issues  relating  to  specific  application  models,  and  performance  of 
the  entire  application  rather  than  its  parts.  The  resource  layer  (R-Layer)  is 
common  across  many  different  application  models,  and  encapsulates  any 
hierarchy  in  the  resources.  For  example,  system  layers  in  a  network  protocol 
stack  belong  in  the  R-Layer,  and  multiprocessor  clusters  may  treat  the  cluster 
and  individual  multiprocessors  as  different  layers.  Potentially,  there  may  also  be 
a  separate  mission  layer,  which  addresses  mission-level  objectives  and  tradeoffs 
across  applications  to  achieve  them.  For  now,  only  the  A-layer  and  R-layer  are 
considered. 

Each  layer  is  characterized  by:  workload  model,  QoS  model,  service  requests, 
request  translation  and  generation,  negotiation  and  resource  allocation,  real-time 
monitoring,  adaptation  models  and  policies,  and  enactment.  Note  that  the  layered 
architecture  described  in  this  report  has  been  generalized  under  RTARM  to  a 
hierarchical architecture.

The following sections describe the layers in more detail. It is currently assumed
that the target applications are sensor-based multiple pipelines. Each layer
receives a service request, translates it, and attempts to provide that service by
negotiating for the services provided by lower layers. Existing, already admitted
requests might have to be squeezed through adaptation to release enough
resources to admit new requests.


Figure  5:  Layered  Resource  Management  Architecture 

Once  a  request  is  admitted  and  enacted,  real-time  monitoring  allows  the 
workloads  and  delivered  QoS  to  be  measured.  Adaptations  are  triggered  when 
the  delivered  QoS  falls  outside  acceptable  threshold  regions.  As  described  in  the 
detailed  sections,  there  is  commonality  among  the  possible  adaptations.  The 
Enactment  component  is  responsible  for  bringing  adaptations  into  effect. 


Figure  6:  Resource  Management  Components  in  Every  layer 

Figure 6 shows the main components in each layer. The {R, A} arrows indicate
the flow of service requests and of monitored or actual {QoS, Workload},
respectively, across layers.


5.1  Mission-Level  RM 

A mission is a set of applications, some of which may interact with one another.
The set is dynamic as applications may arrive and depart dynamically. RM for
applications is viewed in the context of a global mission-wide objective.
Temporally, a mission may have several phases, with possibly different
objectives and constituent applications. Transition between phases may be
triggered by any of the general trigger conditions considered in ARM, such as
operator action or detected failure to meet the current objective.

Services  provided  by  the  M-Layer 

The  M-layer  manages  the  underlying  resources  in  such  a  manner  as  to  meet 
mission-level  objectives.  This  service  may  be  viewed  as  being  provided  to  a 
mission  (rather  than  to  individual  applications).  QoS  parameters  associated  with 
the  service  are  chosen  to  represent  the  mission-level  objectives.  An  example  of  a 
mission-level  objective  is  to  maximize  the  overall  value  of  the  application  set. 

5.2  Application-Level  RM:  A-Layer 

Applications may have different programming or computational models. For this
effort,  an  application  is  a  multiple  pipeline,  with  possibly  a  reconfigurable 
structure,  as  described  in  Section  4.  The  A-layer  does  not  understand  missions, 
but  manages  resources  for  applications  to  meet  their  individual  QoS 
requirements. 

Services  provided  by  the  A-Layer 

The  A-layer  provides  resource  management  for  individual  applications  or  their 
components. It translates the incoming service request and QoS requirements
and generates requests to the R-Layer for computational services, memory,
and  network  services.  The  requests  may  be  made  for  each  service  separately,  or 
jointly.  For  example,  the  request  for  computation  and  memory  services  may  be 
made  together  if  the  A-Layer  wishes  to  constrain  the  allocation  to  be  co-located. 

The A-layer also monitors application-level QoS of individual applications, which
requires  computing  this  QoS  from  information  about  the  delivered  QoS  from  the 
R-layer.  Hence,  the  A-layer  must  at  least  monitor  the  actual  values  for  all 
components  in  the  application  representation.  Note  that  we  are  assuming 
composability  of  application-level  QoS  from  its  component-level  QoS,  which  is  a 
valid  assumption  for  the  multi-pipeline  applications. 
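One way such composition could work for a multi-pipeline is sketched below in Python. The composition rules are assumptions made for illustration and are not stated in this report: end-to-end latency is taken as the longest source-to-sink path of per-stage latencies (an input-driven stage starts once all of its inputs have arrived), and throughput is taken as the rate of the slowest stage. The function names are hypothetical.

```python
from graphlib import TopologicalSorter


def end_to_end_latency(stage_latency: dict[str, float],
                       inputs: dict[str, list[str]]) -> float:
    """Longest source-to-sink path, summing measured per-stage latencies."""
    finish: dict[str, float] = {}
    # Visit stages so that every upstream stage is processed first.
    for name in TopologicalSorter(inputs).static_order():
        # A stage starts when the last of its inputs has been produced.
        start = max((finish[up] for up in inputs.get(name, [])), default=0.0)
        finish[name] = start + stage_latency[name]
    return max(finish.values())


def pipeline_throughput(stage_rate: dict[str, float]) -> float:
    """Sustained throughput is bounded by the slowest stage."""
    return min(stage_rate.values())
```

For a pipeline where S1 feeds S2 and S3, both of which feed S4, with stage latencies of 2, 5, 3 and 1 time units, the composed end-to-end latency is 8 units, following the path S1, S2, S4.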

Adaptation triggers include QoS violation of an entire application or its
substructures, explicit request from the M-Layer, and detection of failure in lower
layers. The A-layer decides if application-level adaptation is needed. Possible
adaptations are:

•  Adjust  the  requested  QoS  of  the  application  components  in  a  way  that 
does  not  violate  application-level  QoS  delivered  to  the  upper  layer.  Such 
adjustment  may  be  localized  to  a  subset  of  an  application  or  it  may  be 
application-wide. 

•  Without  changing  requested  QoS,  use  the  services  of  the  R-Layer  to 
perform  ARA-style  redistribution  of  already  allocated  resources.  This  is 
based  on  transfer  of  resources  from  application  components  experiencing 
better  than  requested  QoS,  to  components  with  worse  than  requested 
QoS.

5.3  Resource-Level RM: R-Layer

The R-layers represent resource hierarchies. In general, a platform consists of
computers connected over networks. Each computer may be a Symmetric
Multiprocessor (SMP), a distributed memory massively parallel machine (MPP), or a
uniprocessor. Networks may include LANs and high-performance interconnects
providing shared memory.

An SMP consists of processors and memory shared among all processors. A
distributed memory MPP consists of MPP-nodes connected by an MPP-network,
where MPP-nodes consist of one or more processors and memory shared
between them. A workstation consists of a processor and memory.

Service  provided  by  the  R-Layer 

The R-Layer manages computing, network and memory resources for whole
multi-pipeline structured applications or subsets of them. The R-Layer does not
understand applications, although it can do application-wide resource management
when the pipeline structure submitted to it is for an entire application. The QoS
parameters in the request from the A-Layer are those associated with
multi-pipeline application components (e.g., nodes, arcs) and structures.

Requests  to  the  R-Layer  are  for  computational  services,  memory,  and  network 
services.  The  requests  may  be  made  for  each  service  separately,  or  jointly.  For 
example, the request for computation and memory services may be made
together if the A-Layer wishes to constrain the allocated resources to be
co-located.

The R-layer translates incoming service request QoS parameters to QoS
parameters for individual processors and links, for example in the case of MPPs.
The  R-Layer  monitors  the  delivered  performance  and  performs  low-level 
adaptation.  As  in  all  layers,  adaptation  triggers  include  QoS  violation,  and  explicit 
request  from  the  A-Layer. 

5.4  Architecture  Evolution 

As mentioned earlier, the layered architecture described in this section has been
generalized into a hierarchical architecture for resource management. As
applications are built on top of services and services may be built on top of lower
level services, resource management for the entire system is viewed as a
hierarchy of service managers. Each node in the hierarchy can provide support
for admission control, QoS translation, resource allocation, real-time monitoring,
adaptation and enactment. The attached paper "Hierarchical architecture for
real-time adaptive resource management" describes the generalized RTARM
architecture.

6.  Real-Time Instrumentation

We  used  the  Honeywell  Scalable  Programmable  Instrumentation  (SPI)  system 
for  real-time  monitoring.  SPI  offers  the  capability  of  monitoring  a  heterogeneous 
system  in  terms  of  traditional  metrics  such  as  latency  and  execution  times,  as 
well  as  metrics  that  depend  on  application  semantics.  Compared  with  other 
monitoring  approaches,  SPI  allows  the  construction  and  evaluation  of  arbitrary 
detectors  using  predefined  as  well  as  user-defined  actions  and  it  also  allows 
distributed  coordination  of  all  instrumentation  activity  and  data. 

Under this effort, we extended SPI in several ways, including extensions to
accommodate dynamically arriving and departing applications, and integration
with the resource management system.

Georgia  Tech  used  their  Falcon  system  for  real-time  monitoring.  Falcon  can 
detect  significant  changes  in  a  number  of  performance  metrics.  These  monitors 
produce  instrumented  streams  of  sampled  parameter  values.  Sample  parameters 
include  subtask  execution  time,  subtask  communication  time,  communication 
volume,  input  frame  rates,  and  other  measures  of  application  performance  or 
resource  utilization.  It  is  also  possible  to  monitor  application-specific  measures 
such  as  the  frequency  of  specific  message  types,  access  patterns  to  internal  data 
structures  or  any  other  measure  that  is  representative  of  the  application's 
resource  usage.  Detectors  operate  on  these  streams  to  produce  detection  events 
corresponding to potentially significant deviations from performance guarantees.

7.  ARM  Run-Time  System 

This section describes the implementation of the main components of the ARM
run-time system, including the ARM Layers, Multi-Application Infrastructure, and the
ARM Control Infrastructure.

7.1  Multi-Application  Infrastructure 

The  core  of  this  implementation  is  an  infrastructure  to  control  the  processes  of  an 
MPI  application  dynamically  by  shrinking  and  expanding  the  number  of 
processes  in  a  graceful  manner.  We  used  the  LAM  version  of  MPI  because  of  the 
dynamic  process  spawning  capability  that  it  provides  to  the  user.  This 
infrastructure  contains  a  two-level  resource  manager  system  (system  resource 
manager  and  application  resource  manager). 

This  infrastructure  allows  multiple  applications  to  co-exist  on  the  system  under 
the  control  of  a  system  resource  manager.  The  system  level  resource 
management  layer  is  between  application  level  resource  management  and  the 
operating  system(s).  The  objective  of  the  system  resource  manager  is  to 
continuously  monitor  and  keep  up  the  overall  performance  level  as  defined  by  the 
mission.  This  SW  architecture  is  as  shown  in  the  following  Figure.  It  enables: 

•  Applications  to  be  spawned  on  multiple  (distributed)  processors 

•  Applications  to  receive  a  given  QoS 
-  Negotiation  between  the  resource  manager  and  the  application

-  Dynamic  reconfiguration  of  the  number  of  application  processes  as  and 
when  the  need  arises 

•  Dynamic feedback adaptation operations to be performed within an application

ARM  assumes  that  the  application  programs  use  MPI  (not  necessarily  the  LAM 
version).  All  application  processes  must  call  the  ARM  initialization  procedure 
when  they  start  and  a  termination  procedure  when  they  exit. 

7.1.1.  ARM  Server 

We  implemented  a  central  ARM  server  (system  resource  manager)  and  built 
utilities  (and  APIs)  through  which  multiple  application  programs  can  execute  in  a 
controlled  manner.  Currently  the  server  provides  the  following  run-time  services: 

•  Admit  new  applications 

•  Expand  (grow)  current  application  in  size 

•  Shrink  current  application  in  size 

A  user  or  an  application  agent  can  request  these  services.  At  any  point  of  time, 
the  server  maintains  information  about  resources,  applications  and  the  binding  of 
resources  to  applications.  It  also  maintains  two  request  queues,  one  for  the 
currently  active  applications  and  one  for  newly  admitted  applications.  These 
queues  are  maintained  for  only  those  requests  that  require  new  (additional) 
resources. For example, admit and expand both require resources. In the case
of shrink, the request is handled immediately. Four types of triggers invoke the
scheduler:

•  After  an  application  admission 

•  After  an  application  departure 

•  After  an  application  shrinkage 

•  After  an  application  expansion  request 

Currently,  we  have  a  simple  FCFS  scheduler  that  first  considers  the  queue  for 
the  active  applications,  and  then  considers  the  queue  for  new  applications  for 
scheduling. 
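
The two-queue FCFS policy can be sketched as follows. This is a minimal Python
illustration, not the ARM server code: the class and field names are
hypothetical, resources are reduced to a single processor count, and the rule
that a queue stops being served at its first request that does not fit is an
assumption.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    app: str
    kind: str        # "admit" or "expand"
    processors: int  # additional processors requested


class FcfsScheduler:
    def __init__(self, free_processors: int):
        self.free = free_processors
        self.active_queue = deque()  # expand requests from running applications
        self.new_queue = deque()     # admit requests from new applications

    def submit(self, req: Request) -> None:
        queue = self.new_queue if req.kind == "admit" else self.active_queue
        queue.append(req)

    def release(self, processors: int) -> None:
        """Shrink or departure: handled immediately, never queued."""
        self.free += processors

    def schedule(self) -> list:
        """Serve the active queue first, then the new queue, each in FCFS
        order. A queue stops at its first request that does not fit."""
        granted = []
        for queue in (self.active_queue, self.new_queue):
            while queue and queue[0].processors <= self.free:
                req = queue.popleft()
                self.free -= req.processors
                granted.append(req)
        return granted


sched = FcfsScheduler(free_processors=4)
sched.submit(Request("A", "admit", 3))
sched.submit(Request("B", "expand", 2))
first = sched.schedule()   # B (active) is served before A (new)
sched.release(2)           # e.g. another application shrinks
second = sched.schedule()  # A now fits
```

In this sketch each of the four triggers listed above would simply call
schedule() after updating the free processor count.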

7.1.2.  ARM  Agent 

The ARM Agent is spawned automatically by the ARM server. Currently, for every
application,  the  server  spawns  one  agent,  which  in  turn  spawns  the  application 
processes.  We  also  implemented  synchronization  protocols  for  information 
exchange  among  the  ARM  Server,  ARM  Agents,  and  the  application  processes. 

Application  Growth:  The  growth  of  an  application  (in  terms  of  number  of 
processes)  can  be  initiated  either  by  the  server  or  by  the  agent.  We  have  defined 
two  types  of  protocols  for  their  synchronization  -  a  synchronous  protocol  and  an 
asynchronous  protocol.  In  the  synchronous  protocol,  the  server  issues  a  grow 
command  to  the  agent,  which  then  informs  all  the  application  processes.  If  the 
agent gets back acknowledgements from all of the application processes within a
certain time, it sends back an acknowledgement to the server. If this
acknowledgement is received by the server before its timer expires, it sends a
commit message to the agent, which then does the same to all of the application
processes, and then the grow operation takes place. If the timer expires in
either the server or the agent, that party issues a cancel message to the
appropriate parties immediately.
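
The synchronous protocol resembles a two-phase handshake guarded by timers. The
Python sketch below simulates one round using message delays instead of real
messages. The function name, the Outcome values, and the delay parameters are
illustrative assumptions, as is the requirement that the agent's timeout not
exceed the server's.

```python
from enum import Enum


class Outcome(Enum):
    COMMITTED = "committed"
    CANCELLED_BY_AGENT = "cancelled by agent"
    CANCELLED_BY_SERVER = "cancelled by server"


def synchronous_grow(process_ack_delays, agent_timeout, server_timeout,
                     agent_to_server_delay=0.0):
    """Simulate one synchronous grow round.

    process_ack_delays: time each application process (at least one) takes
    to acknowledge the agent's grow notification. All times share one unit
    and are measured from the moment the server issues the grow command.
    """
    if agent_timeout > server_timeout:
        raise ValueError("sketch assumes agent_timeout <= server_timeout")

    # Phase 1: the agent waits for an acknowledgement from every process.
    slowest_ack = max(process_ack_delays)
    if slowest_ack > agent_timeout:
        return Outcome.CANCELLED_BY_AGENT   # agent timer expired first

    # The agent acknowledges to the server, whose own timer is running.
    ack_arrival = slowest_ack + agent_to_server_delay
    if ack_arrival > server_timeout:
        return Outcome.CANCELLED_BY_SERVER

    # Phase 2: the server commits, the agent relays the commit, and the
    # grow operation takes place.
    return Outcome.COMMITTED


ok = synchronous_grow([0.1, 0.3], agent_timeout=0.5, server_timeout=1.0)
slow_process = synchronous_grow([0.1, 0.9], agent_timeout=0.5,
                                server_timeout=1.0)
slow_link = synchronous_grow([0.4], agent_timeout=0.5, server_timeout=1.0,
                             agent_to_server_delay=0.7)
```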

In the asynchronous protocol, the server is optimistic and sends only a single
message to the agent, and the agent is responsible for informing the server
asynchronously about the success or failure of the operation. If the message
does not reach the agent for any reason, the server learns of this only when the
application departs.

Application  Shrinkage:  As  with  application  growth,  shrinking  can  be  initiated 
either  by  the  server  or  by  the  agent.  If  an  agent  is  the  initiator,  it  asynchronously 
informs  the  server,  which  then  makes  an  update.  If  the  server  is  the  initiator,  it 
goes  through  the  protocols.  For  shrink  we  have  implemented  two  protocols 
similar  to  those  for  application  growth. 

7.1.3.  ARM  Control  infrastructure 

This  implementation  consists  of  the  ARM  layers  for  admission  and  adaptation 
control.  This  package  consists  of  several  integrated  modules  -  admission  control, 
real-time  monitoring,  and  feedback  adaptation.  The  ARM  layers  (A-Layer  and  the 
R-layers)  are  bundled  as  a  single  library  package  used  by  a  centralized  ARM 
controller  for  admitting  new  applications.  The  new  applications  request  the 
service  through  an  ARM  server.  The  ARM  controller  was  implemented  as  an 
Event-Action  machine  of  the  SPI  (Scalable  Programmable  Instrumentation) 
system,  which  was  extended  to  handle  dynamic  arrival  of  the  processes  to  be 
monitored.  The  control  software  architecture  is  as  shown  in  the  following  figure: 


Figure  7:  ARM  Control  Infrastructure 

To  start  ARM,  the  LAM  daemon  is  started  by  the  user  with  the  required  hardware 
configuration.  The  user  then  starts  the  SPI  loader,  which  starts  the  SPI  main  EA 
machine,  the  ARM  controller,  and  the  other  required  SPI  EA  machines.  In  the 
current implementation, there is a single ARM controller instance and multiple
monitor instances. Each ARA monitor is associated with the set of application
processes and an SPI real-time display.

To  start  an  application,  a  user  uses  a  client  utility,  which  establishes  connection 
with  the  ARM  controller  and  forwards  the  user's  request  for  admission,  shrinkage 
or  expansion.  A  new  initialization  protocol  is  added  to  all  application  processes  to 
facilitate  communication  from  the  ARM  controller  to  the  application  processes. 
This  protocol  requires  application  processes  to  establish  a  socket  connection  to 
the  ARM  controller.  The  application  processes  then  continuously  look  for 
remapping  messages  from  the  ARM  controller  on  this  socket  during  execution. 
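
One way an application process might look for remapping messages between units
of work is a non-blocking poll on its controller socket. The sketch below is a
hypothetical illustration in Python: the RemapChannel class and the
newline-delimited JSON message format are assumptions, and a local socket pair
stands in for the connection to the ARM controller.

```python
import json
import select
import socket


class RemapChannel:
    """Application-side view of the socket to the controller."""

    def __init__(self, sock: socket.socket):
        self.sock = sock
        self.buffer = b""

    def poll(self, timeout: float = 0.0):
        """Return the next remapping message, or None if none has arrived.
        With timeout=0 the call never blocks the application."""
        readable, _, _ = select.select([self.sock], [], [], timeout)
        if readable:
            self.buffer += self.sock.recv(4096)
        if b"\n" not in self.buffer:
            return None
        line, self.buffer = self.buffer.split(b"\n", 1)
        return json.loads(line)


# A socket pair stands in for the controller/application connection.
controller_end, app_end = socket.socketpair()
channel = RemapChannel(app_end)

before = channel.poll()  # nothing sent yet
controller_end.sendall(b'{"app_id": 1, "processes": 6}\n')
after = channel.poll(timeout=1.0)

controller_end.close()
app_end.close()
```

Polling between work units keeps the application single-threaded; a dedicated
listener thread would be an alternative design.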

7.2  ARM  Layers 

The  control  of  a  layer's  functionality  is  embedded  within  a  manager  for  that  layer. 
These  layer  managers  are  responsible  for  allocation  of  resources  and  adaptation, 
using  the  services  of  the  lower  layer  wherever  necessary. 

7.2.1.  A-Layer 

In  the  ARM  implementation,  the  interface  to  the  A-layer  is  through  an  object 
(class) called AppManager. The AppManager is a specialization of the
Manager  class.  It  contains  objects  such  as  the  negotiator,  allocator,  enactor, 
detector  and  adaptor.  The  AppManager  also  has  a  reference  to  the  R-layer 
manager  (ResManager).  This  reference  is  created  during  the  instantiation  of  the 
AppManager.  The  application  (task)  requests  service  from  the  A-layer  using  the 
method  TestAndHold()  of  the  AppManager  object.  The  AppManager  assigns  an 
ID  (task  id),  then  uses  its  negotiator  object  to  request  appropriate  service  from 
the  R-layer  (since  the  A-layer  by  itself  does  not  have  resources). 

The  Negotiator  translates  the  application  request  into  one  that  is  understood  by 
the  R-layer.  This  translation  is  called  the  forward  translation  and  it  involves 
translating  task  structures  along  with  workloads  and  QoS.  After  translation,  the 
negotiator  makes  a  request  to  the  R-layer  manager  through  the  reference 
maintained  by  the  AppManager.  Once  the  request  returns,  the  negotiator 
translates  back  the  assigned  QoS  into  the  one  understood  by  the  A-layer 
(backward  translation).  After  this,  the  control  passes  back  to  the  AppManager, 
which  then  reviews  the  returned  QoS  and  returns  it  to  the  requesting  task  along 
with  a  task  identifier.  Further  interaction  between  the  requester  and  the 
AppManager  takes  place  through  the  following  methods  using  the  assigned  task 
id: Reserve(), Release(), Abort(). Whenever resources are allocated, the
AppManager maintains the task structures corresponding to the two layers along
with their task ids and the resource allocation (QoS allocation) info in a hash
table indexed by the task id. This is part of the TestAndHold() method. The task
model understood by the A-layer is the ‘App’ class, which inherits from both
the ‘Task’ class and the ‘Graph’ class.
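
The request path through the A-layer can be sketched in Python as follows. Only
the names AppManager, ResManager, TestAndHold, and Release come from the text;
everything else is a hypothetical simplification. In particular, QoS is reduced
to a frame rate, resources to a processor count, the translation uses a made-up
frames-per-processor factor, and Release is assumed to return resources to the
R-layer.

```python
import itertools


class ResManager:
    """Stand-in for the R-layer manager; holds a pool of processors."""

    def __init__(self, processors):
        self.free = processors

    def TestAndHold(self, exec_request):
        granted = min(exec_request["processors"], self.free)
        self.free -= granted
        return {"processors": granted}


class Negotiator:
    def __init__(self, frames_per_processor):
        self.fpp = frames_per_processor  # hypothetical translation factor

    def negotiate(self, app_request, res_manager):
        # Forward translation: A-layer QoS (frames/s) to R-layer demand
        # (processors), rounded up.
        exec_request = {"processors": -(-app_request["frame_rate"] // self.fpp)}
        res_qos = res_manager.TestAndHold(exec_request)
        # Backward translation: R-layer allocation to A-layer QoS.
        qos = {"frame_rate": res_qos["processors"] * self.fpp}
        return exec_request, res_qos, qos


class AppManager:
    def __init__(self, res_manager, negotiator):
        self.res_manager = res_manager  # reference created at instantiation
        self.negotiator = negotiator
        self.tasks = {}                 # hash table indexed by task id
        self._ids = itertools.count(1)

    def TestAndHold(self, app_request):
        task_id = next(self._ids)
        exec_request, res_qos, qos = self.negotiator.negotiate(
            app_request, self.res_manager)
        # Keep the task structures of both layers plus the allocation.
        self.tasks[task_id] = (app_request, exec_request, res_qos)
        return task_id, qos

    def Release(self, task_id):
        _, _, res_qos = self.tasks.pop(task_id)
        self.res_manager.free += res_qos["processors"]


mgr = AppManager(ResManager(processors=8), Negotiator(frames_per_processor=10))
task_id, qos = mgr.TestAndHold({"frame_rate": 25})
```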

7.2.2.  R-Layer

The  entry  point  for  this  layer  is  the  ResManager  object.  It  contains  handles  to  the 
actual  resources  (managed  by  appropriate  managers).  The  purpose  of  this  object 
is  strictly  to  provide  a  body  for  embedding  the  main  control  loop  of  the  R-layer. 
The A-layer requests service from this object using the method TestAndHold().
The task model in the R-layer is called the execution graph and is represented as
a class named ExecGraph. The ResManager calls the negotiator of this layer to
make requests to the actual resource managers.

7.3  ARM  Controller 

The  ARM  controller  is  responsible  for  admission  control  and  starting  of 
application processes. Once application processes are started, they send
performance information to the ARM controller through SPI channels established
during the initialization phase of the application process. Depending on the
application id (which is assigned by the ARM controller) of the process that is
sending data, the performance data is routed to an appropriate monitor. Each
application is assigned one monitor. When the ARA monitor decides to remap an
application, it sends the new mapping for that application to the ARM controller.
The controller then forwards the mapping over the TCP/IP channel established
between itself and all the application processes as part of the application
initialization protocol.
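
The routing step can be pictured as a lookup from application id to monitor.
The Python sketch below is a hypothetical reduction of that idea: the class and
method names are illustrative, and the SPI channels are replaced by direct
method calls.

```python
class Monitor:
    """One monitor per application; collects that application's samples."""

    def __init__(self, app_id):
        self.app_id = app_id
        self.samples = []

    def receive(self, sample):
        self.samples.append(sample)


class Controller:
    def __init__(self):
        self.monitors = {}      # application id -> Monitor
        self._next_app_id = 1

    def admit(self):
        """Assign an application id and create the application's monitor."""
        app_id = self._next_app_id
        self._next_app_id += 1
        self.monitors[app_id] = Monitor(app_id)
        return app_id

    def route(self, app_id, sample):
        """Forward performance data to the monitor of the sending application."""
        self.monitors[app_id].receive(sample)


ctrl = Controller()
radar = ctrl.admit()
vision = ctrl.admit()
ctrl.route(radar, {"latency_ms": 12.5})
ctrl.route(vision, {"latency_ms": 40.0})
ctrl.route(radar, {"latency_ms": 13.1})
```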

8.  Demonstrations 

We  developed  three  demonstrations  for  this  project.  The  first  demonstration  was 
given  in  October  1997  on  a  network  of  Sun  Solaris  machines  as  shown  in  the 
Figure  below.  It  showed  QoS-based  admission  control  and  dynamic  resource 
allocation  for  multiple  synthetic  sensor-based  MPI  applications. 




The vertical slice implementation included: a) a layered architecture for
management  of  processor  resources,  b)  admission  control  including  QoS 
translation,  and  c)  dynamic  reconfiguration  based  on  feedback  of  actual  QoS 
(real-time  monitoring,  detection,  and  reallocation)  within  individual  applications. 

Georgia  Tech  contributed  two  demonstrations  focusing  on  the  utility  of  the 
technology  and  techniques  developed  in  this  project.  They  developed  several 
additional  applications  with  the  objective  of  demonstrating  specific  levels  of 
improvement. 

Decision  Models 

Experiments  using  a  synthetic  workload  generator  and  the  statically  defined 
decision  model  parameters  yielded  promising  results.  With  the  Bayesian  decision 
model,  we  realized  an  overall  reduction  in  unsuccessful  invocations  of  the  cost 
evaluator  and  number  of  unnecessary  resource  reallocations.  This  allowed  more 
cycles  for  useful  computation  and  masked  the  use  of  the  more  complex 
Markovian  decision  process.  Experiments  with  frame  latency  showed  similar  or 
improved  performance  compared  with  the  simple  decision  model  for  a 
significantly  lower  number  of  remappings. 

Integration  of  the  reactive  Bayesian  model  with  the  predictive  Markovian  model 
improved  latency  and  reduced  false  reallocations  under  a  variety  of  input 
conditions.  Under  average  input  conditions,  both  models  contributed  to 
decreasing  end-to-end  latency  and  reducing  the  decision  and  enactment 
overhead.  The  Bayesian  model  proved  better  in  high  noise  environments  and  the 
Markovian  model  proved  better  in  low  noise  environments.  In  these  situations, 
the  less  suited  model  provided  good  backup  support  for  the  more  effective 
model.  Under  high  noise  conditions,  the  Bayesian  level  filtered  a  much  larger 
percentage  of  input  spikes  while  the  Markovian  level  ensured  that  performance 
did  not  fall  below  the  real-time  specifications. 

Vision  Application 

Georgia  Tech  evaluated  some  of  their  adaptation  techniques  on  a  vision 
application called Pfinder. The application consisted of a camera function, an
X-interface handler, and image processing functions. Adaptation was performed by
reconfiguring the application mapping based on on-line monitoring of data flow
rates.

Evolution  of  Demonstrations 

Since  the  merger  of  this  project  with  Real-Time  Adaptive  Resource  Management 
(RTARM)  in  1997,  we  targeted  our  demonstrations  to  the  new  hierarchical 
resource  management  architecture.  A  description  of  the  technical  features  of  the 
demonstrations  is  given  in  the  attached  papers. 

9.  Publications

The  following  publications  based  on  this  work  are  attached.  Some  of  the 
publications  describe  research  derived  only  partly  from  this  project,  and  contain 
the  results  of  subsequent  continuing  work. 

•  D.  Ivan  Rosu,  K.  Schwan,  S.  Yalamanchili,  and  R.  Jha,  "On  adaptive 
resource  allocation  for  complex  real-time  applications",  in  Proceedings  of 
the  18th  IEEE  Real-Time  Systems  Symposium,  San  Francisco,  December 
1997. 

•  D. Paul, S. Yalamanchili, K. Schwan, and R. Jha, "Decision models for
adaptive resource management in multiprocessor systems".

•  M. Cardei, I. Cardei, R. Jha, and A. Pavan, "Hierarchical Feedback
Adaptation For Real Time Sensor-based Distributed Applications".

•  I. Cardei, R. Jha, M. Cardei, and A. Pavan, "Hierarchical Architecture For
Real-Time Adaptive Resource Management".




On  Adaptive  Resource  Allocation  for  Complex  Real-Time  Applications* 


Daniela Ivan Roşu, Karsten Schwan,
Sudhakar Yalamanchili
Georgia Institute of Technology
801 Atlantic Drive, Atlanta, GA 30332-0208
{daniela, schwan}@cc.gatech.edu
sudhakar.yalamanchili@ee.gatech.edu

Rakesh Jha
Honeywell Technology Center
3660 Technology Drive
Minneapolis, MN 55418
jha@src.honeywell.com


Abstract 

Resource allocation for high-performance real-time applications is
challenging due to the applications’ data-dependent nature, the dynamic
changes in their external environment, and the limited resources available
on the embedded systems on which they run. These challenges may be met by
use of Adaptive Resource Allocation (ARA) mechanisms that can promptly
adjust resource allocation to changes in applications’ resource needs,
whenever there is a risk of failing to satisfy the application’s timing
constraints. Although not decided by the application, such adjustments
satisfy the application’s adaptation capabilities. ARA eliminates the need
for ‘over-sizing’ real-time systems to meet worst-case application needs.
This paper proposes an application model used to describe the application’s
resource needs and its adaptation capabilities. The model also describes
the runtime variation of application needs. The paper also proposes a
satisfiability-driven set of performance metrics for capturing the impact
of ARA mechanisms on the performance of real-time applications. The
relevance of the proposed metrics set is demonstrated experimentally, using
an adaptive, synthetic application designed to represent time-critical
applications in C3I systems.


1.  Introduction 

Motivation. The resource management problems for real-time and embedded
applications are exacerbated by the dynamic changes in their external
environment and by the restrictions on resource availability. One commonly
used solution is worst-case resource allocation. In many cases this is not
a realistic option because of the exceedingly high resource estimates
resulting from complex interactions among the application components. If
static resource allocation is not viable, adaptive methods must be used to
adjust resource allocation to changes in the application’s needs, thereby
reducing the likelihood of failing to meet its real-time constraints.

Contributions. This paper describes and evaluates models and mechanisms for
Adaptive Resource Allocation (ARA) in the context of high performance,
embedded applications. We consider applications with data-dependent
execution, driven by event streams, composed of multiple, possibly
parallel, interacting components. Runtime changes in event rates and, more
importantly, in the data content of these events cause important changes in
the resource needs of various application components. For such
applications, it is simply not feasible to model accurately the per-event
processing and communication needs. This class of applications includes
radar systems [26], robots [7, 35, 39], target recognition, multi-object
tracking, and hypothesis testing [25].

ARA mechanisms can be used to promptly adjust resource allocation to
changes in applications’ resource needs, whenever there is a risk of
failing to satisfy the application’s timing constraints. Although not
decided by the application, these adjustments satisfy its adaptation
capabilities and eliminate the need for ‘over-sizing’ real-time systems to
meet worst-case application needs.

This paper describes a novel model for capturing an application’s
adaptation capabilities by specifying the resource needs corresponding to
each acceptable configuration. In addition, the model captures the runtime
variation of the resource needs caused by unexpected changes in application
behavior.

Given the real-time nature of the applications targeted by this research,
we propose to evaluate the ARA mechanisms by their impact on the
satisfiability of the applications’ real-time constraints. Specifically, we
submit that it is essential to consider the latencies with which ARA
mechanisms respond to changes in application needs when attempting to
restore the satisfiability of real-time constraints. The quality of ARA
decisions is evaluated with respect to how fast the application can return
to acceptable performance and how good the performance in steady state is
compared to the levels imposed by the applications’ real-time requirements.

In this study we identify elements that contribute to the effectiveness of
ARA methods and heuristics. More specifically, we experimentally show the
effects of early detection, enactment overhead, and incremental
reallocation heuristics.

Assumptions and Experimental Environment. In this work we assume that a
multi-machine environment is dedicated to a single, complex application. As
a result, performance perturbations are produced only by dynamics in the
application’s external environment or by changes in resource availability
due to failures or explicit removals/additions. We also assume the explicit
use of admission control mechanisms to guarantee sufficient resources to
meet an application’s initial required performance levels.

The models and heuristics proposed here are evaluated in the context of a
centralized ARA controller. Online monitoring is performed with mechanisms
described in [14]. Experiments are conducted with a synthetic application
running on a cluster of workstations. The application was designed by
Honeywell in the context of high performance C3I applications [25].

Related  research.  Previous  work  has  described  frameworks 
and  mechanisms  that  facilitate  the  creation  and  use  of  online 
adaptation  heuristics  for  real-time  applications  [5,  18,  22], 
including  mechanisms  for  runtime  monitoring,  adaptation 
enactment, and mechanisms that ensure the reliable execution of
applications [5, 22] or maintain high application
throughput  [18].  In  comparison,  the  focus  of  this  paper  is 
not  to  define  new  frameworks,  but  instead,  to  define  models 
and  methods  to  be  used  in  such  frameworks  and  to  analyze 
their  effect  on  the  adaptive  applications. 

Extensive research has addressed the problem of dynamic resource allocation
for both the real-time [1, 3, 4, 9, 15, 17, 31, 40] and the non-real-time
[13, 23, 27, 34] domains, typically considering dynamic resource allocation
in the context of load balancing. However, the methods developed in these
studies do not fit our target application model. This is because our model
assumes that the resource needs of a time-constrained task, even when
generated by the same type of event, may vary throughout the execution of
the application. This variability prevents us from using a periodic task
model [15, 17] in which performance requirements are fixed throughout an
application’s execution, and therefore worst-case needs have to be
considered. It also prevents us from using a sporadic task model, as in the
real-time [9, 31, 40] or the non-real-time [13, 34, 23] domains, because of
the high overhead of taking resource allocation actions at each task
arrival. In addition, the specification of a real-time parallel task, as
needed for an application component, is either too complex (in the
real-time models) or incomplete (in the non-real-time models), because it
does not describe the interaction among the parallel modules of the same
component.

Resource reallocation triggered by runtime variation of application needs
has received less attention. The schemes proposed for both real-time
[4, 32, 17] and non-real-time [18, 23, 27, 38] domains do not consider the
transitory effects of reallocation mechanisms on the satisfiability of the
application’s performance constraints. Instead, they are primarily
interested in using adaptations to attain optimal average-case performance.

Overview  of  paper.  In  the  remainder  of  this  paper,  we 
first  identify  the  application  and  the  ARA  model  driving  our 
research (Section 2). In Section 3, we describe two important components
of the application model used for ARA:
the  application  resource  usage  model  and  the  application 
adaptation  model.  In  Section  4  we  identify  specific  ARA 
performance  criteria  derived  from  the  real-time  nature  of 
our  target  application.  Last,  in  Section  5,  we  demonstrate 
by  experiments  the  relevance  of  these  criteria  and  identify 
methods  that  help  improve  ARA  performance. 

2.  Real-Time  Applications  and  ARA 

Application Model. Our research targets reactive, high performance
applications that must meet well-defined real-time constraints in dynamic
execution environments. Each such application consists of multiple
interacting components capable of executing in a distributed environment
consisting of parallel machines, embedded-system components (e.g., signal
processors), and user interface stations (e.g., workstations). Components
are either sequential or parallel tasks, and their resource needs may be
data-dependent, varying with changes in the rate or content of data inputs.
In response, many components are programmed such that they can adapt their
resource needs at runtime, by changes in their execution mode, algorithms
or specific attributes such as the level of parallelism or communication
protocols.

An application’s execution is driven by event streams produced by the
external environment or application components. Each event stream is
processed by a fixed set of components, with fixed precedence constraints
described by a communication graph. The input pattern of a stream may vary
with changes in the execution environment. We use the term
intra-communication to name the communication among parallel modules of the
same application component, and the term inter-communication to name the
communication between the component and its neighbors in the communication
graph. We assume that, for each event, the intra-communication happens
throughout the event processing while the inter-communication happens in a
burst at the end of the source component computation.

The application’s performance requirements are defined by constraints with
respect to event rate, end-to-end latency, and inter-component relative
completion delays. Each timing constraint may have specific bounds on its
miss rate and/or burst.
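
To make the notion of a bounded timing constraint concrete, the Python sketch
below checks a latency trace against a deadline, a maximum miss rate, and a
maximum burst. Reading "burst" as the longest run of consecutive misses is an
assumption, and the function name and parameters are hypothetical.

```python
def constraint_satisfied(latencies, deadline, max_miss_rate, max_burst):
    """Check a trace of per-event latencies against one timing constraint.

    A miss is an event whose latency exceeds `deadline`. The constraint
    holds if the fraction of misses is at most `max_miss_rate` and no run
    of consecutive misses is longer than `max_burst`.
    """
    misses = [latency > deadline for latency in latencies]
    miss_rate = sum(misses) / len(misses)

    longest_run = current_run = 0
    for missed in misses:
        current_run = current_run + 1 if missed else 0
        longest_run = max(longest_run, current_run)

    return miss_rate <= max_miss_rate and longest_run <= max_burst


# Ten events against a 0.2 s latency bound: three misses, two consecutive.
trace = [0.10, 0.25, 0.15, 0.30, 0.35, 0.10, 0.10, 0.10, 0.10, 0.10]
tolerant = constraint_satisfied(trace, deadline=0.2,
                                max_miss_rate=0.3, max_burst=2)
strict = constraint_satisfied(trace, deadline=0.2,
                              max_miss_rate=0.3, max_burst=1)
```

Here the tolerant bounds are met (miss rate 0.3, longest burst 2), while the
strict variant fails on the burst bound alone.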


[Figure 1 diagram not reproduced. Event streams: radar input, missile
tracking, missile control. Legend: sensor/actuator, application component.]


Figure  1.  Radar  Application 

Sample Application. One sample application driving this research is a radar
system. Figure 1 presents part of such a system, as described in [26].
Detection, Track Init and Track Identify are computation-intensive tasks,
each well suited for parallel implementation [25]. Over time, their
processing and communication needs vary with the number and characteristics
(e.g., amplitude, direction) of dwells. Given the nature of their
computation [25], these tasks can adapt by changes in their levels of
parallelism.

The main event streams in the radar system are (1) the input from the
radar, (2) the input from the missile tracking device, and (3) the missile
control requirements. Timing constraints concern necessary event rates and
processing latencies. For instance, the rate of the radar input is 1500 Hz,
and the missile control events must be sent at a rate of 4 Hz. Additional
constraints are a 0.2-second bound on the latency between Detecting a
potential missile and engaging Search Control, and a 0.5-second bound on
the execution of Engage.

The radar system is one of the many applications concerned with processing
signals from a sensor suite, forming hypotheses about and assessing the
situation, and taking an appropriate response based on data observed and
processed over a period of time. Other examples are multi-hypothesis
tracking and image understanding [?]. Often the front end of these
applications consists of signal processing stages whose computational needs
are predictable, as they are independent of the signal values. However,
computations at the back end depend on the semantic content of the signal
values and are often heavily data-dependent.

Specific Resource Allocation Problems. The application model presented
above poses interesting resource allocation problems. First, the
event-stream-based execution makes long-term resource allocation a viable
option. In contrast, short-term resource allocation based on dynamic
real-time scheduling decisions [31, 3, 40] is prone to add too much
overhead to each event processing, in particular because the application
components might often be parallel tasks executing in a distributed
environment.

Second, worst-case-based allocation, the typical approach used in complex
real-time systems, might not be appropriate for any application in our
targeted class. In the context of data-dependent resource needs, it might
be very difficult to evaluate the worst-case needs with enough accuracy to
ensure both a safe execution and acceptable resource utilization. For
example, in the radar system (see Figure 1), Track Init has very
data-dependent needs, as they vary with the number of dwell returns above a
selected threshold and the ambiguity of spurious tracks. Thus, the
worst-case needs depend on the worst-case execution scenario, which makes
them hard to evaluate and possibly very large compared to the needs of a
typical execution scenario.

Our solution to these problems is to use adaptive resource allocation
(ARA). By taking advantage of the application’s adaptation capabilities,
this method permits using long-term resource reservations while
accommodating runtime changes in resource needs.

Adaptive Resource Allocation. ARA is a resource management paradigm that
takes advantage of an application’s ability to adapt at runtime in order to
accommodate dynamic resource needs and to satisfy the system goals with
respect to performance and resource utilization. In the context of our
target application model, the goal of ARA is to ensure that, at any time,
the performance requirements of the application are satisfied.

In our approach, the ARA infrastructure can satisfy two types of resource
requests: explicit and implicit. An explicit request is issued by the
application upon a component’s arrival in the system, or whenever the
application deems it necessary to adjust its resource usage. An implicit
request is issued by the ARA infrastructure itself, when changes in a
component’s resource needs considerably increase the likelihood of failing
to satisfy the application’s performance requirements.

The implicit requests, and sometimes also the explicit ones,
are satisfied by adjustments to the resource allocation of one
or more application components, decided by the ARA
infrastructure itself. Such adjustments are called automatic
because they are not explicitly requested by the application.
They are performed only when the performance constraints of
the application would otherwise very likely be violated, and
they strictly observe the application/component-specific
adaptation capabilities. For example, an automatic adjustment
might be performed when, due to a lack of resources in the
system, a new application component cannot be accommodated
unless the allocation of other components is reduced.
Similarly, an automatic adjustment can be triggered by an
unexpected change in the execution environment that causes a
change in resource needs that cannot be accommodated in the
current configuration. For example, a change in the content of
the input data may increase the event processing time of a
particular component, which would require increasing the
component's level of parallelism in order to keep up with the
event rate.

In an alternative approach [5], the resource management
infrastructure can satisfy only explicit requests, but it can
provide the application with information on its observed
resource usage. The resource usage adjustment decisions are
made by the application itself.

In contrast, our approach based on automatic adjustments moves
part of the burden of the adaptation decisions from the
application to the resource management infrastructure. A
similar approach is taken in [18, 17] and also in our previous
work [32]. The benefit of this approach is that unexpected
changes in the application's resource needs are likely to
receive a faster response. Compared to the application, the
resource management infrastructure has faster access to all
the information related to resource availability and to the
current resource usage pattern of each application component.
In addition, the application's overhead of tracking the
runtime variation of its requirements is eliminated. The
drawback is that, compared to application-level decisions, the
ARA decisions may fail to produce the most appropriate
resource assignment for each particular situation. Likewise,
ARA may make changes in resource allocation that are not
necessary for the good performance of the application.
However, the models and mechanisms embedded in an ARA
infrastructure can help minimize these drawbacks.

Figure 2. Centralized ARA controller


In order to achieve its functionality, the ARA infrastructure
should include mechanisms for: (1) collecting information
about application resource usage and resource availability;
(2) detecting significant variations in application resource
usage; (3) inferring the cause of observed variations and
assessing the necessity of an automatic adjustment of the
resource usage; (4) making decisions about resource
assignments and automatic resource allocation adjustments;
(5) notifying the application about significant changes in its
resource usage; (6) notifying the application and the resource
providers about changes in resource allocation and assisting
them in the enactment of these changes. We assume that each
application component capable of runtime adaptations has
specific reconfiguration procedures that can be triggered by
notifications of reallocation decisions received from the ARA
infrastructure.

The ARA functionality is based on knowledge of the application
characteristics. These characteristics are described by an
internal application model. Besides the structure of the
application (components, event streams, communication graphs)
and its performance requirements, the model describes, for
each application component, the acceptable configurations
(i.e., those instances of resource allocation that permit it
to perform correctly) and the runtime variation of resource
requirements. The model is used for the interpretation of
monitored information, the estimation of system performance
upon changes in resource allocation, and the guidance of
decision heuristics. The internal application model strongly
influences how well the ARA infrastructure can overcome the
drawback regarding the appropriateness of its decisions, as
well as the execution overheads of the ARA mechanisms.

The performance of the overall ARA infrastructure, and of each
of its mechanisms, is reflected in the resulting application
performance not only through how appropriate the resource
allocation decisions are but also through how fast the ARA
infrastructure responds to unexpected changes in application
behavior. A short response time helps to reduce the intervals
in which the application does not satisfy its timing
constraints and to remain within the acceptable miss rate
limits. Delayed ARA decisions, or decisions that take too long
to be enacted, are less likely to reduce the risk of failing
to satisfy the application's timing constraints.

In our work, the ARA functionality is provided by a module
called the ARA controller. This module can have a distributed
or a centralized architecture. Figure 2 depicts a centralized
controller, similar to the one used in our experiments. The
controller's interaction with the application is restricted to
monitoring and reallocation enactment.

In the next sections we address the internal application model
and the performance evaluation of an ARA infrastructure. Both
issues have a significant impact on how ARA can help an
adaptive application cope with unexpected changes in its
resource usage and with restrictions in resource availability.


3.  Internal  Application  Model 

This section describes the first novel contribution of our
research. We propose models describing the application's
resource usage and its adaptation capabilities, both part of
the internal application model maintained by the ARA
infrastructure:

• The resource usage model (RUM) describes an application's
expected computational and communication needs and their
runtime variation.

• The adaptation model (AM) describes an application's
acceptable configurations in terms of expected resource needs
and application-specific enacting overheads.

The RUM is used in the ARA decision-making process to evaluate
the application's current resource needs and to determine how
the performance requirements will be satisfied. The AM permits
the ARA controller to decide appropriate resource allocation
adjustments without incurring any negotiation overhead, as is
the case with other resource management solutions that support
runtime adaptations [19]. In addition, the provision of
estimates of the adaptation overheads permits the ARA
controller to understand and evaluate tradeoffs between
alternative adaptation strategies. In the remainder of this
section we describe the two models.

3.1.  The  Resource  Usage  Model 

Background. The resources available to the application are
nodes and the communication links between them. A node is
characterized by its speed (MIPS or MFLOPS) and the size of
its local memory. Each node uses a scheduling policy able to
guarantee the resource reservations and to provide feedback to
the application on its actual resource usage, such as those
proposed in [20, 24]. A communication link provides a
unidirectional connection between two nodes. It is
characterized by one or more protocols (e.g., reliable, FIFO
unreliable), with known available bandwidth and cost of I/O
operations at each end-point - a constant per-message overhead
and a per-byte overhead. For simplicity, the current RUM is
based on uniprocessor nodes. Shared-memory multi-processors
are modeled as sets of nodes, with equally distributed memory
resources and connected by very high-speed communication
links.

Model Formulation. The RUM describes the resource needs for
each pair of application component and event stream. In the
following, such a pair will be called "a component".

Each component is described as an internally parallel task,
with multiple cooperating modules that are independent from
the point of view of resource allocation. The component's
resource needs are described by two models - static RUM and
dynamic RUM. The static RUM describes the expected computation
and communication needs of the component, while the dynamic
RUM captures the runtime variation of the component's needs
with respect to the static RUM.

The  parameters  of  the  static  RUM  are  the  following: 

•  parallelism  level; 

•  execution  time; 

•  intra-communication  protocol; 

•  intra-communication  maximum  message  size  sent; 

•  intra-communication  total  size  sent; 

• total number of intra-communication messages sent;

• inter-communication protocol;

• inter-communication size sent;

• total number of inter-communication messages sent;

•  processor  speed  factor. 

The inter-communication related parameters are defined
separately for each component that follows in the event's
communication graph.

The static RUM is specified by the application as part of an
explicit request for resources. Its parameters can be
estimated using traditional approaches like algorithm analysis
or code profiling. The processor speed factor describes the
performance of the node used for profiling.
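
For illustration only, the static RUM parameters listed above could be held in a record such as the following Python sketch. The class and field names, the units, and the linear rescaling of execution time by the processor speed factor are assumptions made for this sketch; the text does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InterComm:
    """Inter-communication toward one successor component (hypothetical layout)."""
    protocol: str
    size_sent: int       # bytes sent per event
    messages_sent: int   # messages sent per event


@dataclass
class StaticRUM:
    """Static RUM of one (component, event stream) pair."""
    parallelism_level: int
    execution_time: float          # seconds per event on the profiling node
    intra_protocol: str
    intra_max_message_size: int    # bytes
    intra_total_size: int          # bytes sent per event
    intra_messages_sent: int
    speed_factor: float            # performance of the node used for profiling
    # One entry per successor component in the event's communication graph.
    inter: Dict[str, InterComm] = field(default_factory=dict)

    def execution_time_on(self, node_speed_factor: float) -> float:
        """Rescale the profiled execution time to another node.

        Assumption of this sketch: execution time scales linearly with
        the ratio of the two speed factors.
        """
        return self.execution_time * self.speed_factor / node_speed_factor
```

Keeping the speed factor next to the profiled execution time lets a controller compare candidate nodes of different speeds without re-profiling.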

Each parameter of the static RUM is assumed to be the largest
value over the corresponding parameters of all the component's
modules. This is equivalent to assuming that all modules have
identical resource needs, the intra-component communication
between any pair of modules is identical, and a module's
incoming communication is the sum of all messages sent by all
the other modules.

These assumptions keep the model safe and simple. However, the
static RUM can be easily extended to describe the needs of
each module of the application component. It can also be
extended to include additional resource types, such as memory.

The dynamic RUM refers to those parameters of the static RUM
that are likely to vary at runtime due to unexpected changes
in input data content. The model is described by:

• execution factor;

• intra-component total size factor;

• intra-component maximum message size factor;

• inter-component total size factor.

Each factor represents the ratio between the maximum monitored
value of the corresponding metric over an application-specific
time interval and the value specified in the static RUM. The
dynamic RUM is maintained by the ARA controller based on
monitoring data received from the application.
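
The factor definition above can be sketched as follows. This is a minimal illustration, not the controller's actual code: the class name is hypothetical, the application-specific time interval is approximated by a fixed number of most recent samples, and a factor of 1.0 is assumed before any sample arrives.

```python
from collections import deque


class DynamicFactor:
    """One dynamic RUM factor: max monitored value / static RUM value."""

    def __init__(self, static_value, window_size):
        self.static_value = static_value
        # Assumption: the monitoring interval is a sliding window of samples.
        self.samples = deque(maxlen=window_size)

    def observe(self, monitored_value):
        """Record one monitored sample (e.g. an event's execution time)."""
        self.samples.append(monitored_value)

    def value(self):
        """Return the current factor; 1.0 means 'as specified'."""
        if not self.samples:
            return 1.0
        return max(self.samples) / self.static_value
```

A factor above 1.0 indicates that the component currently needs more than the static RUM specifies; a factor below 1.0 indicates unused reservation.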

Model Discussion. Given the static RUM, the ARA controller can
obtain a good estimate of the component's computation and
communication needs and use this information, together with
information on the event's input pattern and on the component
deadline, to perform per-resource schedulability analysis and
reservations. The computation needs include the execution time
and the computation related to performing the communication.
The latter is estimated based on the number of I/O operations
and the total amount transferred. The communication needs
result directly from the model. Unlike the typical real-time
connection model [2], the static RUM does not model the
intra-communication burst (the inter-communication being
assumed bursty). This parameter is only related to the memory
needs on the nodes and in the network. We ignore it because it
can be substituted - for the node, by adding a memory
parameter to the static RUM, and - for the network, by
specifying a 'maximum message size' large enough to cover the
maximum burst.
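
As a rough sketch of the computation-need estimate described above, the CPU demand per event can be taken as the execution time plus the I/O cost derived from the per-message and per-byte overheads of the communication link. The function names and the simple additive form are assumptions of this sketch; the schedulability analysis itself is not shown.

```python
def cpu_demand(execution_time, messages, total_bytes,
               per_message_overhead, per_byte_overhead):
    """Estimated CPU time per event, in seconds.

    Communication-related computation is modeled as a constant cost per
    I/O operation plus a cost per byte transferred (assumed additive).
    """
    communication_cpu = (messages * per_message_overhead
                         + total_bytes * per_byte_overhead)
    return execution_time + communication_cpu


def cpu_utilization(demand, event_period):
    """Fraction of one node consumed when events arrive every event_period."""
    return demand / event_period
```

For example, 10 ms of execution plus 4 messages totalling 2000 bytes, at 50 microseconds per message and 10 nanoseconds per byte, gives a demand of about 10.22 ms per event.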

The dynamic RUM permits the ARA controller to make appropriate
automatic adjustments even when the observed resource needs
are larger than the application specifications. Such a
situation may appear when the static RUM does not describe the
worst-case needs, either because it was not possible to
estimate them accurately or because the programmer decided so,
possibly driven by the very small likelihood of situations
where the needs get close to the worst-case limit.

The information needed to maintain the dynamic RUM could be
obtained with low monitoring overhead by instrumenting the
communication library.

Related Work. The resource usage model introduced here
addresses the deficiencies of real-time task models used in
previous research [37, 10, 11, 16, 21], which do not permit a
low-complexity description of a parallel component. According
to such models, a parallel application component should be
described by a set of tasks with precedence constraints, each
with fixed computation and communication needs, and with the
I/O operations occurring only at the beginning and the end of
a task (or event) execution. This would require each parallel
component to be decomposed into multiple, small-granularity
tasks. Even when feasible, such a complex decomposition would
significantly increase the ARA decision overhead. Despite its
reduced level of detail, which keeps the model simple and the
decision overhead low, the RUM permits good estimates of the
task performance.

The RUM also improves on previous parallel task models used in
load balancing or task assignment problems [12, 17, 6, 27, 28,
29, 36], which do not describe the intra-communication needs.
By considering these needs, the RUM enables better resource
management and a better estimation of the communication
effects on the application performance.

3.2.  Adaptation  Model 

Background. Each adaptive application component has several
acceptable configurations. In general, the overhead of
instantiating a new configuration has an
application-independent and an application-dependent part. The
application-independent overheads include the start-up of a
new parallel module and the resource reservations (on the host
and in the network). The application-dependent overheads,
called here adaptation overheads, are determined by the
component-specific reconfiguration procedures. We assume these
are primarily determined by state transfers and
initializations, and are significant when switching between
configurations with different levels of parallelism. We also
assume that the ARA controller can evaluate the
application-independent overheads.


Model Formulation. The adaptation model describes the
acceptable configurations and the corresponding adaptation
overheads for each pair of application component and event
stream.

An acceptable configuration is described by: (1) a
configuration id, used by the ARA controller to notify the
application about the changes in its resource allocation;
(2) a static RUM, which specifies the resource needs as
described in Section 3.1; (3) adaptation overheads, described
separately for module start-up and shut-down. The adaptation
overheads are described by the amount of state to be
transferred and the execution time of the corresponding
procedures (excluding communication).

The adaptation model is specified by the application upon an
explicit request for resources. For each application
component, several acceptable configurations may be described.
The ARA assumes that the static RUMs for all configurations in
an adaptation model are compatible, in the sense of describing
the requirements of solving the same problem in different
configurations.
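
The structure of an acceptable configuration can be sketched as below. This is an illustrative layout only: the class names are hypothetical, the static RUM is reduced to a plain dictionary, and the overhead estimate assumes that state is transferred over a link of known bandwidth, with the transfer time simply added to the procedure's execution time.

```python
from dataclasses import dataclass


@dataclass
class AdaptationOverhead:
    """Adaptation overhead of a module start-up or shut-down."""
    state_bytes: int        # amount of state to be transferred
    execution_time: float   # seconds, excluding communication

    def estimate(self, bandwidth):
        """Estimated overhead in seconds for a link of `bandwidth` bytes/s.

        Assumption: transfer time = state_bytes / bandwidth, added to the
        execution time of the reconfiguration procedure.
        """
        return self.execution_time + self.state_bytes / bandwidth


@dataclass
class AcceptableConfiguration:
    config_id: int                # used in notifications to the application
    static_rum: dict              # resource needs, as in Section 3.1
    startup: AdaptationOverhead   # overhead of starting a module
    shutdown: AdaptationOverhead  # overhead of shutting a module down
```

An adaptation model would then be a collection of such configurations, one entry per acceptable level of resource allocation.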

Model Discussion. The set of acceptable configurations permits
automatic adjustments of a component's resource usage without
negotiation. The adaptation overhead permits the ARA
infrastructure to estimate and control the enactment
overheads, which can affect the short-term application
performance.

Related Work. The inclusion of the adaptation overhead in the
description of an acceptable configuration makes our model
different from other schemes that allow the application to
specify a set of acceptable configurations [1] at resource
request time.

Our current model does not allow the application to specify
the "value" each particular configuration brings to it, as in
[1]. This is motivated by the current goal of our ARA: to
satisfy the application's performance requirements, with no
concern for the overall "value" of the application.
Nevertheless, our adaptation model can be easily extended to
include a value parameter as well.

3.3.  Using  the  Models 

We  briefly  describe  how  the  RUM  and  the  adaptation 
model  are  used  by  the  ARA  infrastructure.  Details  can  be 
found  in  [33]. 

The application requests an initial resource allocation by
specifying an adaptation model. Based on the current resource
availability, the ARA controller chooses an acceptable
configuration, performs the corresponding reservations and
notifies the application.

At runtime, each component is described by a current RUM. The
static RUM corresponds to the acceptable configuration
selected by the last allocation decision. The dynamic RUM is
maintained based on the current static RUM and monitoring
information.

When the likelihood of failing to satisfy the application's
performance requirements increases above an
application-specific acceptable threshold, the ARA controller
can decide on an automatic adjustment of the application's
resource allocation.

During the ARA decision, for each component, the static RUMs
of its acceptable configurations are scaled by the
corresponding dynamic RUM parameters. If for some component
the current usage is larger than the current static RUM, the
scaled static RUMs will describe needs larger than the initial
specifications, and are therefore likely to fit the new
application behavior better. On the other hand, if the current
needs of some component are lower than the specifications, the
scaled static RUMs will describe smaller needs, enabling the
ARA controller to evaluate the unused resources and to take
advantage of them in providing other components/applications
with better service.
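
The scaling step can be sketched as follows, assuming for illustration that static RUMs are dictionaries and that each dynamic RUM factor multiplies one static RUM parameter. The parameter and factor names, and the pairing between them, are assumptions of this sketch.

```python
# Assumed pairing between static RUM parameters and dynamic RUM factors.
FACTOR_FOR = {
    "execution_time": "execution_factor",
    "intra_total_size": "intra_total_size_factor",
    "intra_max_message_size": "intra_max_message_size_factor",
    "inter_total_size": "inter_total_size_factor",
}


def scale_static_rum(static_rum, dynamic_rum):
    """Return a copy of static_rum with the variable parameters scaled."""
    scaled = dict(static_rum)
    for param, factor_name in FACTOR_FOR.items():
        if param in scaled:
            # A missing factor is treated as 1.0 (no observed variation).
            scaled[param] = scaled[param] * dynamic_rum.get(factor_name, 1.0)
    return scaled


def scale_configurations(configurations, dynamic_rum):
    """Scale the static RUM of every acceptable configuration.

    `configurations` maps configuration id -> static RUM dictionary.
    """
    return {cid: scale_static_rum(rum, dynamic_rum)
            for cid, rum in configurations.items()}
```

Parameters without a factor, such as the parallelism level, are left unchanged, and the original specifications are not modified, so the next decision can rescale them with fresh factors.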

4.  ARA  Performance  Characterization 

The second contribution of our research is the proposal of a
satisfiability-driven approach to evaluating the performance
of an ARA infrastructure, different from the typical
optimality-driven approach. In the context of a real-time
application, we claim that the ARA infrastructure's reactivity
is often more important than the optimality of its decisions.
In addition, each ARA decision instance is equally important
to the application; therefore, we do not consider it
appropriate to measure the performance by averages over a
large set of instances.

Our experiments show that delays in adjusting the resource
allocation to changes in the application behavior increase the
delay in reaching a safe steady state. Thereby, a resource
allocation characterized by large decision and enactment
overheads, as an optimal decision is very likely to generate,
increases the likelihood of failing to satisfy the
application's timing constraints. For instance, in a
heterogeneous distributed system, an optimal minimization of
the end-to-end latency may require migrating all or many of
the application components to more appropriate nodes. Such a
reallocation decision may not be appropriate if, during the
enactment, more events than acceptable miss their deadlines.

Focusing on the satisfiability of the application's
performance requirements, we evaluate the performance of the
ARA infrastructure by its response to a single variation in
the application behavior that increases the risk of violating
the performance requirements, called a critical variation.
Specifically, we consider the following metrics (see
Figure 3):

• reaction time - the period between the occurrence of the
critical variation and the completion of the corrective
reallocation enactment;

• recovery time - the interval between the enactment
completion and the restoration of an acceptable performance
level;

• performance laxity - the difference between the required
performance and the steady-state performance after
reallocation.

Figure 3. Performance metrics for the evaluation of an
automatic ARA decision

A  good  ARA  controller  is  expected  to  have  a  low  reaction 
time,  low  recovery  time  and  large  performance  laxity. 

These metrics reflect the effect of the ARA mechanisms on the
satisfiability of the application's performance constraints:
recovery time and performance laxity relate to the quality of
the ARA reallocation decision, while reaction time relates to
the ARA mechanisms as a whole: detection, decision, and
enactment.

None of the above metrics can completely describe the ARA
controller's performance. Specifically, performance laxity
cannot measure the transitory effects of reallocation, while
reaction time and recovery time do not reflect steady-state
improvement. Moreover, trade-offs exist between focusing on
performance laxity vs. reaction time. Optimal performance
laxity may result in reaction times that exceed acceptable
delays due to high decision or enactment overheads.

When  interested  in  characterizing  the  whole  controller's 
performance,  not  only  a  single  instance  of  critical  variation, 
the  reaction  time  and  recovery  time  can  be  estimated  by  their 
maximums  and  the  performance  laxity  by  its  minimum  over 
all  instances  of  critical  variations. 
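
A small sketch of how these metrics could be computed from logged timestamps follows. The record layout and names are hypothetical, and performance laxity is written for a metric such as latency whose requirement is an upper bound, so that a larger laxity is better.

```python
from dataclasses import dataclass


@dataclass
class CriticalVariation:
    """Measurements for one critical variation (hypothetical record)."""
    t_variation: float    # when the critical variation occurred
    t_enacted: float      # when the corrective enactment completed
    t_recovered: float    # when acceptable performance was restored
    required: float       # required performance (upper bound, e.g. latency)
    steady_state: float   # steady-state performance after reallocation

    @property
    def reaction_time(self):
        return self.t_enacted - self.t_variation

    @property
    def recovery_time(self):
        return self.t_recovered - self.t_enacted

    @property
    def performance_laxity(self):
        return self.required - self.steady_state


def characterize(instances):
    """Controller-level figures: worst case over all critical variations."""
    return {
        "reaction_time": max(i.reaction_time for i in instances),
        "recovery_time": max(i.recovery_time for i in instances),
        "performance_laxity": min(i.performance_laxity for i in instances),
    }
```

Using maxima and minima rather than averages matches the view that each decision instance is equally important to a real-time application.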

The proposed metrics set is relevant for a real-time
application. Poor reaction time and recovery time increase the
time interval during which the application's performance
constraints are not satisfied. Poor performance laxity
increases the risk of failing to satisfy the constraints. The
next section demonstrates the relevance of reaction time
through experiments.

Another interesting issue regarding the ARA infrastructure's
performance is the necessity of automatic adjustments. The
perturbation induced on the application by an unnecessary
adjustment increases the risk of failing to meet the
constraints. Unfortunately, assessing the necessity of an
adjustment often requires knowledge of the future evolution of
the system, which typically is not available. For instance, a
single spike in CPU needs should not trigger an increase of
the resource allocation for the corresponding component. We do
not include this metric in our set, but we consider it when
designing the mechanisms of the ARA infrastructure. The
necessity metric is related to the detection and state
assessment mechanisms (see Section 2).

Related Work. Previous studies considering automatic
adjustments for ARA of real-time applications [17, 18]
typically compare the performance attained with ARA against
optimal solutions: [18] considers the performance loss with
respect to an ideal ARA mechanism with instantaneous
detection, optimal decision, and no overheads, while [17]
focuses only on the optimality of the allocation decision. In
contrast, we submit that the optimality of dynamic resource
management is less important than the fact that an
application's timing constraints are better satisfied.

5.  Factors  for  ARA  Reaction  Time 

In this section we consider the detection and the reallocation
decision mechanisms and show how their design can affect the
reactivity of the ARA controller and, consequently, the
satisfiability of an application's performance requirements.
Previous ARA-related studies either considered
application-specific detection as a black box [27], or ran
detection mechanisms periodically at application-independent
intervals [17, 32]. Our experiments show that the ARA
infrastructure performance is improved if the application
characteristics and the current state are considered when
choosing the methods for detection and allocation decision. In
addition, our experiments show that the reaction time, a
component of the satisfiability-driven metric set proposed in
Section 4, is a relevant performance metric: the better the
ARA infrastructure's reaction time, the better the application
performance.

The experimental results reported in this study are obtained
with a synthetic, distributed application designed by
Honeywell in the context of high performance C3I applications
[25]. The application runs on a cluster of eleven UltraSPARC-1
Model 170 workstations with an MPI-1 interface over 100 Mbit
switched Ethernet links. The application consists of multiple
communicating components connected by an acyclic graph of
communication links. Each component can adapt its execution to
span any number of processors. Each component module executes
the following steps: (1) receive a message from each of the
modules of the predecessor components, (2) execute according
to the computation and intra-component communication pattern
specific to its component, (3) send a message to each of the
modules of the successor components.


Figure  4.  Configuration  of  Synthetic 
Application:  6-stage  pipeline 

In the following experiments the synthetic application has a
pipeline configuration (see Figure 4). All events have the
same type. They are periodically produced by the Source,
consumed by the Sink and processed by the intermediate
components. For each component, step (2) mentioned above
consists of: (2.1) exchanging a message with all of the
modules in the same component; (2.2) computing for an amount
of time that depends on the parallelism level of the component
and the corresponding speedup coefficient; (2.3) exchanging
messages as in (2.1). A stochastic model is used to emulate a
step-like data-dependent variation of computation and
communication needs.

Enactment is performed on event boundaries. The moment of
performing the enactment (i.e., the id of the event before
whose processing the resource exchange is performed) is
determined by the event currently processed by the closest
predecessor of all of the components participating in the
resource exchange. This method minimizes the enactment
overhead because it requires no synchronization among donors,
receivers and the components with which they communicate.

The adaptation overhead is small and identical for all
components. In consequence, we do not consider it in
reallocation decisions, but we do consider the
application-independent resource reallocation overhead.

In the following, the "acceptable limit" for a particular
performance metric is the upper bound derived from a
corresponding performance constraint. In all the experiments
the acceptable miss burst is one.

5.1.  Detection 

In this section we address the effect of early detection. The
performance of a detection method is evaluated by:
promptness - how soon after its occurrence the critical
variation is signaled; trustworthiness - what ratio of
signaled variations is critical. The prompter the detector,
the earlier the detection and, in consequence, the lower the
ARA infrastructure's reaction time. Detector trustworthiness
is related to the necessity of the reallocation actions: the
more trustworthy the detector, the lower the risk of making an
unnecessary adjustment. In the following experiments, any
detection signal triggers an automatic adjustment.
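
The two detector qualities can be made concrete with a short sketch. The function names are hypothetical, and measuring promptness as a delay in event IDs is an assumption of this sketch; a smaller delay means a prompter detector.

```python
def promptness_delay(occurrence_event, signal_event):
    """Delay, in events, between a critical variation and its signaling."""
    return signal_event - occurrence_event


def trustworthiness(signal_was_critical):
    """Ratio of signaled variations that were actually critical.

    `signal_was_critical` holds one boolean per signaled variation.
    Returns None when nothing was signaled.
    """
    if not signal_was_critical:
        return None
    return sum(signal_was_critical) / len(signal_was_critical)
```

For example, a detector that signals four variations, three of them critical, has a trustworthiness of 0.75.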

Promptness is more important than trustworthiness when the
timing constraints are being violated. Figure 5 presents the
influence of detection promptness on the end-to-end latency
variation when the execution time of the bottleneck component
is critical (i.e., very close to its event inter-arrival
interval). We experiment with two detectors: a
threshold-driven detector, which checks the sample value
against the acceptable limit, and a variation-driven detector,
which is similar to the Sobel detector used for edge detection
in computer vision [8.30]. Of the two, the edge detector tends
to be more trustworthy: it uses a range of points before and
after the point of interest, and it uses smoothing techniques
to eliminate the effects of noise. Unfortunately, these
techniques result in poor promptness. The threshold-driven
detector is likely to be untrustworthy because it is sensitive
to noise, but it is definitely prompt. The impact of a prompter
detector is demonstrated in Figure 5, which shows (e.g., see
Event ID 80) that the number of events failing the end-to-end
latency constraint can be much larger with the variation-driven
detector (smoothing size is 5, and sample-range size is 11).

Figure 5. Promptness vs. trustworthiness:
threshold-driven vs. variation-driven detector
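
The trade-off between the two detectors can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the trailing moving average, the before/after window comparison, and the names `threshold_detect`, `variation_detect` and `min_step` are assumptions chosen only to mimic a smoothed, edge-style detector with smoothing size 5 and sample-range size 11.

```python
from typing import List, Sequence


def threshold_detect(samples: Sequence[float], limit: float) -> List[int]:
    """Signal every sample index whose value exceeds the acceptable limit."""
    return [i for i, s in enumerate(samples) if s > limit]


def moving_average(samples: Sequence[float], width: int) -> List[float]:
    """Trailing moving average; early outputs use the samples seen so far."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def variation_detect(samples: Sequence[float], min_step: float,
                     smooth: int = 5, span: int = 11) -> List[int]:
    """Edge-style detector (illustrative assumption, not the paper's code).

    For each point of interest i, compare the mean of the smoothed samples
    after i with the mean of those before i. A signal is reported at index
    i + span // 2, the first time at which all needed samples are available.
    """
    half = span // 2
    smoothed = moving_average(samples, smooth)
    signals = []
    for i in range(half, len(smoothed) - half):
        before = sum(smoothed[i - half:i]) / half
        after = sum(smoothed[i + 1:i + 1 + half]) / half
        if after - before > min_step:
            signals.append(i + half)
    return signals
```

On a synthetic trace of 40 samples at 10.0 with a one-sample spike to 20.0 at index 5 and a sustained step to 20.0 from index 20, the threshold detector (limit 15.0) signals at the spike and again at index 20, while this variation-driven sketch (min_step 5.0) ignores the spike but first signals at index 24, four samples after the step.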

On the other hand, a trustworthy detector can be used to
detect changes in the application behavior which do not
immediately cause the performance constraints to be violated,
but which increase the risk of such a situation. In our
experiment, a change in execution time which caused the
end-to-end latency to get within 10% of the acceptable limit
(see Event ID 25) is signaled by the variation-driven detector
and triggers a reallocation which reduces the latency to more
than 15% below the acceptable threshold. A threshold-driven
detector cannot be used for detecting changes that are not
critical but increase the risk of failing to satisfy the
performance constraints, because of its sensitivity to spikes.

Promptness can be affected by the method used to evaluate the
metric of interest. Consider a performance metric that can be
evaluated either directly or by composing several independent
metrics. For instance, end-to-end latency can be measured
directly, or it can be evaluated as the sum of the execution
and communication overheads of each application component on
the event path. Consequently, for detection, one can take a
direct approach by using the metric itself (e.g., the observed
latency), or a component-based approach by using the component
metrics (e.g., the observed execution time of each application
component on the event path).

Figure 6. Effects on latency: component-based detector vs.
direct detector

The component-based approach is prompter than the direct
approach, and this improved promptness results in shorter
reaction times (see Figure 6, where component A's execution
time increases). In particular, the difference is significant
when the event path is long (in terms of latency) and the
critical variation occurs early on the path.
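
Why the component-based approach signals earlier can be seen in a small sketch. It assumes, for illustration only, that an event traverses the components sequentially, that communication overheads are ignored, and that each component has its own acceptable execution-time limit; the function names and limits are hypothetical.

```python
from typing import Optional, Sequence


def direct_detection_time(exec_times: Sequence[float],
                          latency_limit: float) -> Optional[float]:
    """Direct approach: the latency is known only when the event leaves
    the last component, so a violation is signaled at the total latency."""
    total = sum(exec_times)
    return total if total > latency_limit else None


def component_detection_time(exec_times: Sequence[float],
                             component_limits: Sequence[float]) -> Optional[float]:
    """Component-based approach: a violation is signaled as soon as the
    first offending component completes, measured from event arrival."""
    elapsed = 0.0
    for t, limit in zip(exec_times, component_limits):
        elapsed += t
        if t > limit:
            return elapsed
    return None
```

For example, with execution times of 30, 10, 10 and 10 msecs, per-component limits of 20, 15, 15 and 15 msecs, and a latency limit of 55 msecs, the component-based check signals 30 msecs after the event arrives, whereas the direct check signals only after 60 msecs. The gap grows with the length of the path that follows the offending component.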

5.2.  Reallocation  Decision 

In this section we address the effects of considering
enactment overheads and of using state-specific incremental
heuristics for deciding automatic adaptations and the
corresponding resource allocations.


Figure 7. Influence of enactment overhead on reallocation
results

Reallocation heuristics that are aware of the enactment
overheads result in improved performance. Low enactment
overhead improves reaction time and reduces the risk of
failing to meet the application’s timing constraints during
the enactment period. Figure 7 shows the end-to-end latency
variation with two decision heuristics that differ in their
awareness of enactment overhead. First, the ‘single-pair’
heuristic (SPH) tries to accommodate a critical variation with
a two-component transaction, which is likely to result in
lower enactment overhead than transactions involving more
components. Second, the ‘fair-decrease’ heuristic (FDH) tries
to be fair about reducing the number of processors available
to different application components, but it ignores the
enactment overhead.

To accommodate a step increase of component A’s computation
needs (see Figure 4), the SPH decides a 2-node transfer from
component C to A, while the FDH decides a 1-node transfer from
each of the components C and D to B. Both heuristics lead to
similar steady-state performance. However, the enactment
overhead with FDH (23 msecs) is larger than with SPH
(18 msecs). Thus, the number of events failing their latency
requirements with FDH is larger.

Application-state driven incremental decisions can reduce the
reaction time. An incremental decision can take advantage of
the current system state in determining which components must
receive or are allowed to donate resources. Such decisions
usually provide more rapid response to performance
perturbations than decisions which are computed using no
history information [6, 17].

Simple incremental heuristics, such as determining a receiver
and then searching for an appropriate donor, can give
acceptable results with low decision overhead, depending upon
the order in which the components are checked. However, the
effectiveness of an ordering criterion varies with the system
state. We experiment with two ordering criteria: (1) by actual
execution time, AE, and (2) by execution time variation with
reallocation, EV.

In a rate-critical state, the primary goal of reallocation is
to reduce the maximum execution time in the system. Thus, the
bottleneck component needs to receive resources, and these
resources can be taken from any other component, provided the
resulting execution time does not violate the acceptable rate
requirement. AE helps to focus immediately on the highest and
lowest execution time components, while EV may search longer
as it is very likely that the bottleneck will not realize the
best improvement. In our experiment, AE order produces an
acceptable reallocation after one try (1.34 msecs), while EV
takes 4 tries (1.58 msecs). Note that in this experiment, a
configuration analysis takes only 0.080 msecs. We expect this
overhead to be larger for more complex application structures,
when more complex timing requirements than end-to-end latency
and maximum achievable event rate are considered.


In a latency-critical situation, the goal is to improve the
sum of the execution times of all of the components on the
critical path. The best solution with a two-pair transaction
is to give resources to the component expected to have the
largest reduction in execution time, and to take these
resources from the component expected to have the lowest
increase in execution time. By following this rule, the EV
heuristic finds the best transaction after one try
(2.56 msecs), while the AE takes 4 tries (2.83 msecs).
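
The two ordering criteria can be sketched as follows. This is an illustrative model, not the experimental code: it assumes ideal speedup (execution time equals work divided by processors), one-processor transfers, and hypothetical component data; `Component`, `order_ae`, `order_ev` and `search` are names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    work: float  # hypothetical: msecs of work per event on one processor
    procs: int   # processors currently assigned

    def exec_time(self, delta: int = 0) -> float:
        # Assumed ideal speedup: execution time = work / processors.
        return self.work / (self.procs + delta)


def order_ae(comps: List[Component]):
    """AE: receivers by highest, donors by lowest actual execution time."""
    donors = [c for c in comps if c.procs > 1]
    return (sorted(comps, key=lambda c: c.exec_time(), reverse=True),
            sorted(donors, key=lambda c: c.exec_time()))


def order_ev(comps: List[Component]):
    """EV: receivers by largest expected reduction, donors by smallest
    expected increase in execution time after a one-processor transfer."""
    donors = [c for c in comps if c.procs > 1]
    return (sorted(comps, key=lambda c: c.exec_time() - c.exec_time(+1),
                   reverse=True),
            sorted(donors, key=lambda c: c.exec_time(-1) - c.exec_time()))


def search(comps: List[Component], order,
           acceptable: Callable[[Dict[str, float]], bool]
           ) -> Optional[Tuple[str, str, int]]:
    """Try (receiver, donor) pairs in the given order; return the first
    acceptable pair and the number of configurations analyzed."""
    receivers, donors = order(comps)
    tries = 0
    for r in receivers:
        for d in donors:
            if d.name == r.name:
                continue
            tries += 1
            times = {c.name: c.exec_time() for c in comps}
            times[r.name] = r.exec_time(+1)
            times[d.name] = d.exec_time(-1)
            if acceptable(times):
                return r.name, d.name, tries
    return None


# Hypothetical four-component system (not the paper's workload).
COMPS = [Component("A", 120.0, 6), Component("B", 30.0, 2),
         Component("C", 40.0, 4), Component("D", 96.0, 8)]


def rate_ok(times: Dict[str, float]) -> bool:
    return max(times.values()) <= 18.0   # rate-critical: bound the bottleneck


def latency_ok(times: Dict[str, float]) -> bool:
    return sum(times.values()) <= 54.0   # latency-critical: bound the sum
```

With these made-up numbers, `search(COMPS, order_ae, rate_ok)` succeeds on its first try while `order_ev` needs four, and the situation reverses for `latency_ok`, where `order_ev` succeeds immediately and `order_ae` needs five tries. The figures depend entirely on the assumed data; only the qualitative behavior mirrors the text.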


6.  Contributions  and  Future  Work 


This paper considers the problem of ARA for high-performance
real-time applications executing in dynamic environments.
Applications consist of multiple parallel tasks with
data-dependent resource needs. Our contributions are:

•  present experimental results that demonstrate the
importance of focusing on the response time of the resource
allocation mechanisms rather than the optimality of their
decisions, when real-time constraints must be satisfied.

•  define an application resource usage model that permits
the description of parallel real-time tasks and enables good
reallocation decisions even when the observed performance is
larger than the specified values.

•  define an adaptation model that makes automatic ARA
decisions possible and permits evaluation of the impact of
enacting these decisions on the application’s timing
constraints.

•  define a novel set of performance metrics to evaluate ARA
performance by focusing on the satisfiability of the
application’s timing constraints. These metrics are reaction
time, recovery time, and performance laxity.

•  identify factors related to detection and decision
techniques which can influence the degree to which an
application meets its real-time constraints. These factors
are early detection, enactment overhead, and state-specific
incremental decision heuristics.

The models and heuristics presented in this paper are shown to
be useful in the context of processor reallocation for an
adaptive, synthetic application designed to represent
time-critical applications in C3I systems. In the future, we
plan to apply them to other types of adaptive applications,
such as a complex, distributed computer vision application. We
also plan to integrate the insights and mechanisms presented
here into a broader framework for resource management destined
for systems where multiple real-time applications coexist, and
where the ARA mechanisms described in this paper are used in
conjunction with online negotiation mechanisms.

Abstract

This paper addresses the problem of effective, on-line adaptive resource
management in parallel/distributed architectures. The class of applications is
very data-dependent, resulting in highly dynamic demands for available
resources. Adaptive resource management is an alternative to engineering
real-time systems toward worst-case application behaviors. To meet performance
constraints, the system must react swiftly to run-time load variations and
accurately redistribute resources in real-time. We propose decision models
that operate on dynamically monitored performance data to determine when
resource reallocation is necessary. The proposed decision models operate at
two levels. The first is a low-level approach involving a Bayesian
probabilistic decision model. The second is a high-level approach based upon
state transitions and a Markovian decision model. The framework for our
evaluation is a synthetic environment capable of simulating event-driven,
multitask applications where each task is partitioned into subtasks executing
on individual processors.


1  Introduction 

This research addresses a class of parallel applications that can be modeled
as a collection of multiple precedence-constrained data-parallel tasks or
stages. We encounter such classes of data-dependent applications that are very
sensitive to run-time changes in event rates and input data content [4].
Consequently, execution is heavily data dependent and imposes highly dynamic
resource demands upon the host system. A primary example of relevant
applications is real-time defense systems that must constantly react to
changes in an external physical environment. The environmental changes result
in highly data-dependent processing loads. For example, Automatic Target
Recognition (ATR) systems experience varying processing loads as a result of
their heavy dependence on scene and algorithm parameters [1]. As the distances
to targets of interest fluctuate, the number of regions of interest to process
changes, resulting in significant variation of the computational load. Because
of their data-dependent nature, the resource requirements of the parallel
tasks will vary significantly during run time.

A possible solution to meeting the performance requirements of such
applications is to statically assign enough resources to accommodate the
worst-case application behavior. This solution is often infeasible because of
the excessive level of resources required to address all possible situations
[4]. The alternative is adaptive resource management to re-allocate limited
resources dynamically in response to an application’s needs. In real-time
environments, efficient, low-latency reallocation is crucial to the ability of
such applications to meet deadlines. A critical component of this reallocation
process is the decision model that determines when a reallocation of resources
is necessary.

Our  work  is  based  on  the  operational  model  depicted  in  Figure  1.  Applications  are 
modeled  as  an  acyclic  graph  of  data-parallel  tasks.  Data  frames  are  pipelined  through 
this  graph  and  each  of  these  data-parallel  tasks  can  be  further  structured  as  a  collection 
of  subtasks,  each  running  on  an  individual  processor.  The  number  of  subtasks  within  a 
task  varies  as  processors  are  dynamically  allocated  to  and  deallocated  from  the  original 
task.  The  subtasks  are  instrumented  to  provide  performance  measurements  in  real-time. 
These  instrumented  streams  of  data  are  processed  by  detectors  that  produce  detection  events 
signaling  major  changes  in  performance  metrics.  Decision  models  process  these  streams  of 
detection  events  to  determine  if  resource  reallocation  is  necessary,  and  if  so,  to  initiate 
procedures  for  the  computation  and  enactment  of  new  reallocations.  In  this  paper  we  only 
address  the  reallocation  of  processors  among  tasks  to  maintain  a  minimal  frame  latency 
through  the  task  graph. 

The majority of existing research on resource allocation and reallocation is
focused on algorithms that determine how to most effectively allocate or
reallocate resources. There is an extensive literature on dynamic resource
allocation, typically in the context of load balancing algorithms (for example
see [8, 15, 21, 12, 18, 9]). Strategies typically focus on where tasks must be
scheduled as a function of available resources. More recent research has
studied dynamic processor scheduling algorithms in multiprocessor systems
[14, 13] and even algorithms for dynamic control of communication resources
[16] in parallel/distributed applications. These resource allocation
algorithms rely on the existence of a mechanism that determines when they are
invoked, for example, at task arrival time. This does not permit reaction to
run-time load variations within the application. We argue that for run-time
reallocation, it is critical to be able to determine when such resource
reallocation algorithms must be invoked during task execution. Accurate timing
can avoid thrashing during transient workload changes, permit low-latency
reallocation, and in some instances preempt performance degradation by
predicting reallocation needs. This focus on decision models complements (and
is distinguished from) the recent work on the online adaptation of systems for
real-time applications [18, 19]. Such frameworks incorporate mechanisms for
run-time monitoring, adaptation enactment, and processor reallocation. We
argue that effective decision models must be incorporated into such frameworks
if they are to be successfully applied to online adaptive resource management
functions.

This paper proposes the effective use of decision models for dynamic resource
allocation in high-performance parallel/distributed systems. Specifically,
this paper proposes a combination of a low-latency decision model that is
reactive in nature with a (relatively) more complex decision model that is
predictive in nature. We show that such a model is quite insensitive to
transient workload shifts or “spikes”, thereby reducing ineffective
reallocations. The model is also quite effective in predicting impending
workload changes. Experiments are presented that relate characteristics of the
application, such as noise, and parameters of the decision model. Thus, the
decision model can be “tuned” based on some knowledge of the application
behavior. Using a synthetic benchmark generator, we experimentally demonstrate
an increase in performance and a decrease in overhead across a range of input
data parameters. While the current implementations are focused on a class of
computationally intensive sensor-processing applications, these decision
models are more generally applicable to asynchronous event-driven
computational models. Throughout this paper the presentation will rely on an
automatic target recognition (ATR) application to illustrate the behavior of
the proposed model.


Figure 1: Operational model for dynamic resource allocation
(diagram labels: instrumented streams, application layer, resource management layer)

The remainder of this paper is organized as follows. Section 2 provides a
description of our system architecture and the problem we are addressing. In
section 3, we provide a detailed description of our proposed decision models.
Lastly, section 4 presents our experimental results and a review of the
contributions of this paper.


2  Problem  Description 

2.1  System  Overview 

We consider an ATR application processing a stream of sensor data frames. The vision
processing  must  extract  targets  from  the  background  terrain  and 

fication  information.  The  goal  is  to  maintain  a  certain  processing  frame  rate.  As  illustrated 
in Figure 1, resource reallocation is managed by a system consisting of four
major components: detection, decision, reallocation, and enactment.

Currently, monitoring is accomplished by a real-time instrumentation system
that can detect significant changes in a number of performance metrics [o] [6].
These monitors produce instrumented streams of sampled parameter values.
Sample parameters include subtask execution time, subtask communication time,
communication volume, input frame rates, and other measures of application
performance or resource utilization. We may also choose to monitor
application-specific measures such as the frequency of specific message types,
access patterns to internal data structures, or any other measure that is
representative of the application’s resource usage. Detectors operate on these
streams to produce detection events corresponding to potentially significant
deviations in performance guarantees. Decision models analyze streams of
detection events to make assertions about the current global state and about
the future state of the system. If a decision to reallocate is made, a cost
evaluator creates a new specification of resource assignments. In this paper
we only consider the assignment of processors to tasks. Tasks are data
parallel, and the number of subtasks of a task is equal to the number of
processors assigned to that task. This new resource assignment must then be
enacted and adopted by the system. In this case, processors within one task
may be reallocated to another task to maintain the frame rate.




2.2  Problem  Definition 

Monitoring  and  detection  occur  constantly  as  data  frames  move  through  the  task  pipelines. 
Decision  models  must  constantly  react  to  the  detection  event  streams  being  produced  by  the 
detectors.  However,  the  cost  evaluation  and  resource  reallocation  only  occur  if  the  decision 
model  signals  the  reallocation  module.  There  is  some  computational  overhead  involved  in 
calculating  a  new  resource  mapping,  so  ideally  one  would  only  incur  this  penalty  under  the 
assurance  of  improved  overall  system  performance.  Because  of  imperfect  monitors  and  noisy 
input,  it  is  difficult  to  determine  accurately  the  optimal  times  to  initiate  a  remapping.  An 
effective  reallocation  decision  policy  must  weigh  the  costs  of  remapping  against  the  potential 
performance  benefits  [3].  In  a  worst-case  scenario,  a  reallocation  may  be  enacted  only  to 
discover  that  the  previous  conditions  were  transient  and  the  system  is  more  unbalanced  than 
before  remapping. 

There  are  two  competing  factors  governing  the  reallocation  decision  process.  The  first 
is  the  desire  for  fast  detection  and  reaction.  This  is  important  because  of  the  real-time 
requirements  of  the  majority  of  these  applications.  Long  and  complex  decision  algorithms  can 
result  in  large  decision  and  enactment  overheads.  This  overhead  can  cause  multiple  events 
to  miss  their  deadlines  because  of  quickly  changing  environmental  conditions.  In  addition, 
many  of  the  dynamic  input  conditions  are  highly  transient  and  unstable.  For  example, 
background  foliage  that  appears  only  in  a  few  successive  image  frames  can  produce  sudden, 
transient  change  in  the  processing  workload.  Therefore,  the  second  important  factor  is  the 
global  performance  of  the  application  using  the  new  resource  mapping.  Ideally,  new  resource 
mappings  will  lead  to  configurations  which  are  stable  and  improve  the  long-term  performance 
of  the  application.  While  quick  decisions  may  result  in  locally  improved  performance,  they 
are  by  nature  not  globally  cognizant,  i.e.,  will  the  change  in  terrain  features  persist  for  a 
relatively  long  period  of  time?  Thus,  it  is  possible  to  continually  make  locally  optimal  task- 
based  decisions  while  the  overall  performance  of  the  application  steadily  moves  toward  a  less 
efficient  state. 


2.3  Solution  Strategy/Approach

This  paper  proposes  the  use  of  a  decision  model  structured  as  two  component  models.  The 
first is a Bayesian decision model, which operates at the lower level of the decision process.
This  probabilistic  model  acts  as  a  filter  between  the  monitoring  system  and  the  reallocation 
module  to  reduce  false  detection  and  incorrect  decisions.  This  will  increase  the  stability  of 
the  application  and  reduce  the  amount  of  unnecessary  overhead  incurred  by  reallocation  in 
response  to  false  detections.  The  second  is  a  Markovian  decision  model  which  operates  at  a 
higher  level  of  the  decision  process.  The  Markovian  model  is  designed  to  keep  track  of  global 
application  performance  by  monitoring  the  state  transitions  of  various  performance  metrics. 
Using  the  state  transition  data,  the  Markovian  model  is  able  to  predict  the  steady-state 
system  performance  and  react  to  potential  future  performance  degradations. 


3  Decision  Models 

The simplest decision model in our framework (Figure 1) has been optimized for
low latency rather than high-accuracy decisions. In this model, referred to as
the baseline model, any detection event immediately triggers a cost evaluation
and potential reallocation. It is apparent that this decision model can lead
to unnecessary overhead, in part because the cost evaluation penalty is
incurred every time a detection event occurs. Often, because of imperfect
monitors, a detection event is not indicative of the overall performance
falling beneath the required limit. In this situation, referred to as false
detection, the overhead involved in cost evaluation is incurred with no
potential for benefit. In response, we propose a two-level decision model for
such systems. The first model is an adaptation of a Bayesian decision model
originally formulated by Nicol [2]. The second is a Markovian decision model
formulated to predict global trends in application performance.


3.1  Bayesian  model 

There are several sources of difficulty that can make the baseline decision
model ineffective. First, the application may be subject to transient load
spikes. Since the application is data dependent, it is sensitive to any
changes in the input stream. A problem occurs with noisy data or “spikes” that
represent transient load changes rather than a stable shift in computational
loads. In such situations, a detection event may not be indicative of a
longer-term change in load that necessitates a resource reallocation. Rather,
it may represent a transient condition, in which case a reallocation may
actually negatively affect performance. In terms of our ATR example, changes
in scenery can result in input spikes. In the absence of new targets, these
spikes are transient and should not be grounds for reallocation. An effective
resource management system must be capable of discerning stable computational
shifts from spikes. Second, the detectors themselves possess some degree of
unreliability. It is possible for detectors both to generate a false detection
event and to fail to detect a genuine one. Increasingly accurate detectors are
computationally intensive and increase the latency between the occurrence of a
load imbalance and the corresponding detection. On the other hand, simple load
detectors may not be effective in accurately detecting stable load changes.
They can significantly raise computation overhead by increasing the number of
false detections. The proposed Bayesian decision model will allow for quick
detection of critical events using simpler detectors while reducing reports of
false detections.

3.1.1  Model  description 

Faced with computational overhead and potentially poor performance in the
event of an unnecessary reallocation, it is not beneficial to run the cost
evaluator on the basis of a single positive report. As illustrated in
Figure 2, the Bayesian decision model adds an extra component, operating as a
smart filter, to the system. In this proposed configuration, a collection of
monitors still records execution parameters as frames pass through the
application. However, in this scheme, all detection events pass through the
smart filter. The smart filter uses a Bayesian decision model to determine if
the cost evaluator should be invoked. It is the goal of the smart filter to
minimize the number of false detections passed to the cost evaluator. Ideally,
the cost evaluator will only be signaled if a potential remapping benefit is
very likely. Conversely, whenever a remapping benefit becomes likely, the cost
evaluator should be signaled as soon as possible.


At each frame time, we compute the probability that performance can be
improved by reallocating resources. This probability is referred to as the
gain probability and is represented by the symbol $p_n$ [2]. The Bayesian
decision process must constantly strengthen or weaken the gain probability
based only on information from the detectors and information about the quality
of their detections. These detectors operate on the instrumented streams
returned from the monitors and look for various metrics at the task level.
While the overall application behavior may be within real-time bounds, the
detectors can notice if a specific task’s performance starts to decrease. By
operating at the lower level, this model can determine improved resource
mappings even if the real-time requirements are not immediately threatened.

Figure 2: Block diagram of the Bayesian decision model.


3.1.2  Model  calculation 

To implement this Bayesian model, three distinct parameters must be defined [2].

1.  $\delta$ - The probability that a performance gain can be realized by
remapping after the current frame, given that it has not been realizable in
the previous frame.

2.  $\alpha$ - The probability of the detectors prematurely reporting a
remapping condition.

3.  $\beta$ - The probability of the detectors failing to report an existing
remapping condition.


The parameters $\alpha$ and $\beta$ encompass the fact that the detectors
contain some inherent inaccuracy. Based on the above parameters, we calculate
the probability $p_n$ that a performance gain can be realized on frame $n$
based upon the results returned by the detectors.

First, we define the pretest probability, $p_\alpha$, that a performance gain
is achievable on frame $n$, given that $p_{n-1} = p$:

$$p_\alpha(p) = p + (1 - p)\,\delta$$

Bayes’ Theorem states that for two events $A$ and $B$, the probability of $A$
given $B$ is:

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B \mid A) \cdot P(A) + P(B \mid A^c) \cdot P(A^c)}$$


Now, if the detectors return a positive remapping report, we want to calculate
the probability that an actual performance gain exists. Using the above
notation, $A$ is the event where a performance gain exists and $B$ is the
event of a positive report. Therefore, $P(A)$ is given by the pretest
probability, $p_\alpha$, and $P(B \mid A)$ is given by the probability of an
accurate positive report. Since $\beta$ is the probability of the detectors
failing to report a gain condition, $(1 - \beta)$ is the probability that the
detectors accurately report a gain condition. $P(A^c)$ is the complement of
$P(A)$, which is given by $(1 - p_\alpha)$. Lastly, $P(B \mid A^c)$ is the
probability that a positive report is returned given that no remapping gain
exists. Recall that this is the exact definition of the parameter $\alpha$.
Substituting these expressions into Bayes’ Theorem gives us the following gain
probability for a positive report:

$$p_n = \frac{(1 - \beta) \cdot p_\alpha(p)}{(1 - \beta) \cdot p_\alpha(p) + \alpha \cdot (1 - p_\alpha(p))}$$

Using similar logic, the gain probability in the event of a negative report
can be computed as follows:

$$p_n = \frac{\beta \cdot p_\alpha(p)}{\beta \cdot p_\alpha(p) + (1 - \alpha) \cdot (1 - p_\alpha(p))}$$

In either case, this probability, $p_n$, represents a weighted measure of the potential for a
remapping gain to exist after frame $n$. When $p_n$ crosses a suitable threshold level, it is deemed
likely that a remapping gain exists. This in turn justifies a cost evaluation to determine if
a better resource allocation is realizable. If so, the reallocation module is signaled and the
remapping is enacted.
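The update rule can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's implementation: the function names and the parameter values used in the usage note are hypothetical, and `phi`, `alpha` and `beta` stand for the three detector parameters defined above (`phi` being the one that enters the pretest probability).

```python
def pretest_probability(p_prev, phi):
    """Pretest probability p_a(p) = p + (1 - p) * phi."""
    return p_prev + (1.0 - p_prev) * phi


def update_gain_probability(p_prev, positive_report, phi, alpha, beta):
    """Return p_n given p_{n-1} and the detector report for frame n.

    alpha: probability of a positive report when no gain exists.
    beta:  probability of a negative report when a gain exists.
    """
    p_a = pretest_probability(p_prev, phi)
    if positive_report:
        numerator = (1.0 - beta) * p_a
        denominator = numerator + alpha * (1.0 - p_a)
    else:
        numerator = beta * p_a
        denominator = numerator + (1.0 - alpha) * (1.0 - p_a)
    return numerator / denominator


def exceeds_threshold(p_n, threshold):
    """True when p_n is high enough to justify a cost evaluation."""
    return p_n >= threshold
```

With the illustrative values `phi=0.1`, `alpha=0.3`, `beta=0.1` and a starting probability of 0, one positive report raises the gain probability to 0.25 and a second consecutive positive report raises it further, while a single negative report leaves it near 0.016. This is the filtering effect: an isolated report moves the estimate only part of the way toward the threshold.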


3.2  Markovian  model 

Bayesian decision models are sensitive to the accuracy of model parameters and as such are
implemented at a low level within the application. These models tend to be reactive as they
wait for events signaled by the detectors. At a higher level of abstraction, we can track the
application performance based on the degree to which user specified performance bounds are
met. Toward this end, we formulate a Markovian decision model. This model is predictive
as it uses performance state data to predict the future behavior of the application. In the case
of our ATR application, the Markovian model analyzes scenic trends. While an individual
scene may not provide much information, a collection of scenes can provide information about
the future direction and the resources that may be required by specific tasks. The Markov
model uses these trends to predict the need for a resource reallocation before it actually
occurs.

3.2.1  Model  description 

By working at the lower level, the Bayesian model is able to detect improved resource mappings
even when the application is conforming to the real-time specifications. The Markov
model, which operates at a higher level, is triggered solely by the level of conformity with
the real-time bounds. It will only trigger a remapping if the application is predicted to
violate these bounds with a high probability. The Markovian decision model can be viewed
as a watchdog for the Bayesian model. If the Bayesian model is able to maintain performance
within the desired specifications, the Markovian model will never intervene. However,
if the system performance appears to threaten the real-time specifications, the Markovian
model will override the Bayesian model and force a resource reallocation.

An example of this coupling appears in Figure 3, which presents a block diagram of the system
incorporating the Markovian decision model. Its primary function is to monitor specific
evaluation metrics of global application performance, e.g. end-to-end frame latency. This is
done by comparing actual measured statistics with the real-time specifications. The ratio of
the measured statistic to the desired bound serves as a metric of the level of conformity of
the application. This level of conformity is then used to map the application into one of a set
of previously defined performance states. These performance states and the transitions between
them provide the underlying framework of the Markovian model.

Figure 3: Block diagram of the coupled decision model.

The inherent differences between the two decision models enable them to be coupled in
a synergistic manner. The Bayesian model is constantly checking the detection streams and
updating the gain probability. When the gain probability exceeds a threshold, it signifies
a high potential for a remapping gain. To take full advantage of this potential and reduce
the chance of missed deadlines, the Bayesian model is coupled with a simple reallocation
algorithm to provide a low latency solution. While these decisions allow a quick response
and often provide an immediate improvement, they may not represent the best resource
allocation. To correct for this, the Markovian model is invoked at periodic intervals. If the
predicted performance of the application is sufficiently poor, a (relatively) more complex
reallocation algorithm can be invoked. This algorithm performs a more extensive (and
therefore costlier) assessment of remapping potential. In these instances, a high quality
remapping which can provide system-wide improvement is more effective than a high speed
decision which can provide immediate but local improvement.
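The division of labor between the two models can be illustrated with a small per-frame dispatch function. This is only a sketch under stated assumptions: the function name, the returned labels and all default values (gain threshold, invocation interval, first unacceptable state) are hypothetical and chosen for illustration, and the two inputs are assumed to be produced elsewhere by the Bayesian and Markovian models.

```python
def coupled_decision(frame, gain_probability, predicted_state, *,
                     gain_threshold=0.5, markov_interval=50,
                     first_unacceptable_state=3):
    """Pick the action for one frame of the coupled decision model.

    gain_probability: current Bayesian estimate that a remapping gain exists.
    predicted_state:  performance state predicted by the Markovian model.
    """
    # Periodic watchdog check. When it predicts an unacceptable state it
    # overrides the Bayesian path and requests the costlier reallocation.
    if frame > 0 and frame % markov_interval == 0:
        if predicted_state >= first_unacceptable_state:
            return "markov_reallocation"
    # Per-frame Bayesian check, leading to the quick, low-latency path.
    if gain_probability >= gain_threshold:
        return "bayesian_cost_evaluation"
    return "no_action"
```

On frames that are not a multiple of the invocation interval, only the Bayesian check runs, which matches the description of the Markovian model as a watchdog that stays silent while performance is acceptable.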

3.2.2  Model  calculation 

We first provide an intuition about the application of the model through the following example.
Consider the state space of an ATR application as represented in Figure 4 where we
wish  to  maintain  a  frame  analysis  rate  above  33  frames/sec.  Therefore,  we  must  maintain 
end-to-end  frame  latencies  below  0.03  seconds.  The  maximum  frame  latency  we  expect  is  .10 
seconds  and  the  latency  range  is  divided  into  five  states.  States  0,  1,  and  2  are  considered 
acceptable  and  states  3  and  4  are  considered  unacceptable. 


                  acceptable states           |  unacceptable states
State:            0         1         2       |  3         4
Latency (sec.):   .00-.01   .01-.02   .02-.03 |  .03-.04   .04-.05


Figure  4:  A  high  level  example  of  the  Markovian  decision  process. 
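The example can be made concrete with a short Python sketch that maps a measured latency to a state number. It assumes the equal-width states drawn in Figure 4 (five states of 0.01 seconds each, so the top of the range is 0.05 seconds); the function names and the clamping of out-of-range measurements into the last state are illustrative choices, not taken from the original implementation.

```python
def performance_state(measurement, max_measurement, n_states):
    """Map a measurement in [0, max_measurement] to a state 0..n_states-1.

    States are assumed to have equal width; anything at or above the
    maximum falls into the last (worst) state.
    """
    if measurement >= max_measurement:
        return n_states - 1
    if measurement <= 0.0:
        return 0
    return int(measurement / max_measurement * n_states)


def is_acceptable(state, first_unacceptable_state):
    """States below the demarcation threshold are acceptable."""
    return state < first_unacceptable_state
```

For instance, a frame latency of 0.025 seconds falls in state 2, which is acceptable when state 3 is the first unacceptable state, while a latency of 0.045 seconds falls in state 4.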


Over time, frames are periodically generated by the source and injected into the system.
At specific instances, referred to as frame intervals, completed frames are consumed at the
sink. The time between frame generation and consumption is the end-to-end frame latency
which we are trying to control. At each frame interval, the monitored information streams
can be used to generate a current snapshot of the application in terms of frame latency
performance. This snapshot is used to categorize the system as residing in a particular
performance state. As shown in Figure 4, the range of performance states is easily partitioned
into a collection of acceptable and unacceptable states. As the data content of the input
frames varies over time, the current performance state of the system will fluctuate. Resource
reallocation is used to improve the performance of the system when it resides in an
unacceptable state. The goal of the Markovian model is to predict when the system is moving
toward an unacceptable state and to avoid it by triggering a resource reallocation in advance.
This is accomplished by recording statistics about the state transitions experienced by the
application. Using the state transition information and Markov theory, we can predict the
steady-state behavior of the application. At periodic intervals, this steady-state prediction
is used to determine if resource reallocation is necessary. This interval is referred to as the
Markovian invocation interval.
The following description is based on sensor applications where data frames are arriving
at some rate. The model can be generalized in a straightforward manner to more general
event driven applications [4] where events such as frames may arrive asynchronously
rather than at a fixed rate.

Given the number of states ($n$), the maximum performance measurement ($\mu$) and the
current performance measurement ($p$), the current state is determined by where $p$ falls
among the $n$ states that divide the measurement range.


To ensure proper operation, any performance measurement exceeding $\mu$ is automatically
categorized into the largest state. Note that in this model, performance metrics are mapped
to states numbered from 0 to $n - 1$. There is a natural mapping for metrics such as latency,
since higher latency values map to higher numbered states. The mapping may be different for
metrics such as throughput where lower throughput values map to higher numbered states
so that we have a consistent interpretation of higher numbered states corresponding to lower
performance.

The Markovian decision model tracks the efficiency of the current mapping by calculating
the performance state of the system as described above. These performance states can
be viewed as a Markov chain, with the application transitioning between them at each
frame time based upon the current data and resource mapping. As the application moves
between the performance states, the Markov model maintains statistics about the state
transitions. With this information, at any point during execution, the Markov model has an
accurate picture of the state transition probabilities of the Markov chain. These transition
probabilities are conditional probabilities for the system to transition to a particular state,
given the current state of the system. If there are $n$ states in the Markov chain then the
collection of all possible one-step transitions can be collected in an $n \times n$ matrix called the
state transition matrix. This state transition matrix can then be used to calculate the
steady-state probability vector.


$$ [\, p_0 \quad p_1 \quad p_2 \quad \cdots \quad p_{n-2} \quad p_{n-1} \,] $$
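The bookkeeping described above can be sketched as follows. This is a minimal pure-Python sketch, not the authors' code: the function names are hypothetical, the treatment of a state with no recorded outgoing transitions (it is assumed to stay put) is an assumption, and the steady-state vector is approximated by repeated multiplication starting from the current state, which assumes the observed chain settles to a single distribution.

```python
def transition_matrix(state_history, n_states):
    """Estimate one-step transition probabilities from observed states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(state_history, state_history[1:]):
        counts[current][following] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: a state with no recorded exits is treated as
            # staying where it is.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([count / total for count in row])
    return matrix


def steady_state(matrix, start_state, iterations=200):
    """Approximate the steady-state vector by power iteration."""
    n = len(matrix)
    dist = [1.0 if i == start_state else 0.0 for i in range(n)]
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n))
                for j in range(n)]
    return dist
```

For a two-state chain with transition matrix `[[0.9, 0.1], [0.5, 0.5]]`, the sketch converges to approximately `[5/6, 1/6]`, the distribution that is unchanged by one more transition step.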


The steady-state probabilities predict, based on the current picture of the system, the
probability of the system settling in each state. This provides a measure of the long-term
global application behavior. While the Bayesian model attempts to make locally optimal decisions,
the steady-state probabilities may show that the system is heading towards an increasingly
unbalanced state. The steady-state probabilities are examined to determine whether
reallocation is necessary. We use the following approach which accounts for the gradient of
unacceptable states. A detailed description of the model can be found in [7].

Using the steady-state vector, the following inner product is computed:

$$ \sum_{i=1}^{n} \cdots $$

The  above  computation  produces  a  weighted  state,  u.  If  this  weighted  state  falls  within 
the  set  of  undesirable  states,  then  a  reallocation  is  invoked.  By  basing  its  decisions  upon 
the  steady-state  probabilities,  the  Markovian  decision  model  represents  a  predictive  process, 
while  the  Bayesian  decision  model  represents  a  reactive  process. 
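One way to read this decision rule is sketched below. It rests on an assumption that should be stated plainly: the inner product is taken here between the steady-state vector and the state numbers 0 to n - 1, which yields an expected (weighted) state. The exact weights are given in [7] and may differ. The function names and the rule that a weighted state at or above the first unacceptable state triggers reallocation are likewise illustrative.

```python
def weighted_state(steady_state_vector):
    """Inner product of the steady-state vector with the state numbers.

    Assumption: the weight of state i is simply i (0-based).
    """
    return sum(i * p for i, p in enumerate(steady_state_vector))


def reallocation_needed(steady_state_vector, first_unacceptable_state):
    """True when the weighted state falls among the unacceptable states."""
    return weighted_state(steady_state_vector) >= first_unacceptable_state
```

With five states and state 3 as the first unacceptable one, the vector `[0.1, 0.1, 0.1, 0.3, 0.4]` gives a weighted state of about 2.8 and no reallocation, whereas `[0.0, 0.0, 0.1, 0.4, 0.5]` gives about 3.4 and triggers one.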


3.3  Model  Parameters 

For  these  decision  models  to  correctly  determine  when  a  remapping  gain  is  probable,  they 
must  have  accurate  knowledge  of  certain  application-specific  parameters.  For  the  Bayesian 
model, these are $\phi$, $\alpha$, and $\beta$ as discussed earlier, and they are measures of how often a
remapping  gain  is  achievable  and  how  accurately  the  detectors  can  recognize  this  situation. 
For  the  Markovian  model,  the  number  and  range  of  performance  states  and  the  threshold 
between  acceptable  and  unacceptable  states  must  be  defined.  In  addition,  the  statistics  for 
transitions  between  these  states  and  the  duration  of  the  interval  between  Markov  invocations 
are  necessary  for  calculating  the  steady-state  distribution. 

A  detailed  explanation  of  the  experiments  to  determine  these  parameter  values  can  be 
found  in  [7].  For  our  experiments,  the  following  values  were  used  for  the  Bayesian  parameters: 

q  =  0.125037  0  =  0.34588  0  =  0.09500 

Our  experiments  use  Markov  chains  consisting  of  20  states  representing  the  frame  latency. 
To  provide  a  reasonable  tolerance  for  an  occasional  missed  deadline  and  a  stricter  tolerance 
for  any  consecutive  misses,  the  Markov  demarcation  threshold  was  chosen  to  be  12.  The 
frequency  of  change  in  the  input  stream  is  characterized  by  a  parameter  called  the  stability 
interval.  The  stability  interval  refers  to  the  number  of  frames  over  which  the  application 
remains  stable,  i.e.,  does  not  require  reallocation.  As  the  Markovian  decision  model  is 
essentially  sampling  the  input  stream  at  each  invocation,  this  translates  to  a  Markovian 
invocation  interval  of  half  the  length  of  the  input  stability  interval. 


4  Performance  Evaluation 

This  section  compares  the  performance  of  the  decision  models  under  varying  input  conditions. 
Our  experimental  platform  consisted  of  two  components.  An  8-node  IBM  SP-2  was  used  for 
our empirical studies which generated the application specific model parameters. The "ATR
application" was a synthetic workload generator that can be configured to represent a range
of  ATR  workloads.  Simulations  were  run  with  the  synthetic  workload  generator  running 
on  a  uniprocessor.  Our  synthetic  workload  generator  allows  us  to  compare  the  different 
models  while  changing  a  number  of  important  input  characteristics.  The  two  primary  input 
characteristics  we  investigated  were  rate  of  change  of  workload  and  the  presence  of  noise. 
Rate  of  change  refers  to  how  frequently  the  input  frames  cause  substantial  workload  changes 
in  the  tasks  of  the  application  requiring  reallocation.  Noise  refers  to  both  the  frequency  and 
size  of  input  workload  spikes.  These  parameters  were  chosen  because  they  are  most  applicable 
to  the  types  of  event-driven  applications  that  these  models  are  designed  to  improve.  They 
are also characteristics for which values can often be ascertained a priori. For example, we
may know the maximum rate at which new targets can appear in the scene, or we may know
something about the quality of the detectors and sensors, or be familiar with the texture of the
terrain. These factors affect the presence of workload spikes and the rate at which we can expect
stable shifts in workload.

Three sets of experiments were performed in this section. The first set of experiments
was designed to evaluate the improvement of the Bayesian decision model over the simple
model used as our baseline. The second set of experiments was designed to illustrate
the predictive capabilities of the Markovian decision model and its ability to improve
application performance when used in conjunction with the Bayesian model. The final set
of experiments tests the fully coupled Bayesian and Markovian models under various input
conditions characterized by their rate of change and the presence of noise. For comparison
purposes, the baseline decision model refers to the model where every detection event invokes
a cost evaluation.

4.1  Bayesian  model 

The  following  graphs  illustrate  performance  improvements  provided  by  the  Bayesian  model 
over  the  baseline  decision  model.  Figure  5  demonstrates  the  reduction  in  false  detection 
percentage  achieved  by  the  Bayesian  model.  Figures  6  and  7  indicate  the  end-to-end  latency 
of  the  injected  frames  and  the  number  of  resource  reallocations  required  to  achieve  that 
latency  using  each  decision  model.  These  graphs  provide  a  comparison  of  the  number  of 
resource  reallocations  enacted  by  the  decision  models  in  response  to  identical  input  streams. 
Figure  8  plots  the  end-to-end  frame  latencies  for  both  models  on  the  same  axes.  This  graph 
is  used  to  compare  the  overall  performance  of  the  two  models. 


Figure  5:  An  illustration  of  the  false  detection  rate  for  the  two  decision  models. 


It is apparent from Figure 5 that the Bayesian decision model significantly reduces the
percentage of unnecessary invocations of the cost evaluator. The smart filter is able to
successfully pare the number of detection events that will not result in a remapping gain,
thereby reducing the amount of unnecessary cost evaluation overhead. Furthermore, by
decreasing the total number of resource reallocations, the Bayesian model also reduces the
amount of unnecessary reallocation overhead. This is accomplished by filtering the detectors'
response to noise and input spikes. The baseline decision model often reacts to input spikes
by reallocating resources at frames where the spike is detected and on immediately successive
frames. This behavior represents unnecessary reallocation overhead as the input spikes are
transient and provide very limited rewards following a resource reallocation.

The combination of reducing both false detection percentage and unnecessary reallocations
improves the behavior of the application in a number of ways. First, less computation
time is spent in the cost evaluation module, and ultimately more processing power can be
used for useful computation. Second, by ensuring that the cost evaluator is invoked only
when there is a high probability for a remapping gain, a more complex evaluation mechanism
can be enacted with (relatively) less overhead. The following two figures compare the number
of resource reallocations required by the baseline and Bayesian models and the latency
characteristics under which the reallocations occur.


[Plot; axis label: End-To-End latency (sec.)]

Figure 6: An illustration of frame latency and reallocation points for the baseline model.

[Plot; axis label: End-To-End latency (sec.)]

Figure 7: An illustration of frame latency and reallocation points for the Bayesian model.


To fully understand the benefits of the Bayesian decision model, one must look at the
remapping behavior of the system along with the frame latency through the system. Figure
6 shows that the baseline decision model remaps a total of 29 times, while Figure 7 shows
that the Bayesian decision model remaps a total of 18 times. This reduction in the number
of resource reallocations primarily results from the filtering of detection events directly
associated with the presence of input spikes.

Because of their transient nature, input spikes cause detection events based upon conditions
that do not persist after the short duration of the spike. Spikes can lead to new
resource allocations predicated on information which is not indicative of a long-term behavioral
change. Resource reallocations resulting from input spikes do not provide large enough
performance benefits to justify their enactment overhead. However, given that an input
spike has already caused a remapping, it may not be necessary to automatically remap after
the spike's completion. A spike-based reallocation may not provide an increase
in performance but it also may not perturb the system enough to cause a decrease in
performance. It is not always beneficial to reallocate resources immediately following an input
spike because the benefits may not outweigh the enactment overhead. Because input spikes
produce easily detectable changes both on the way up and on the way down, the simple
decision model often invokes a resource reallocation both before and after the spike. The
smart filter embedded in the Bayesian decision model allows it to reduce the number of
reactions to input spikes. Despite this, large or long-lasting spikes will cause the Bayesian
model to react. In these situations, the Bayesian model may also filter post-spike reactions
if the performance benefits are not significant enough to warrant a resource reallocation. By
utilizing both pre- and post-spike filtering, the Bayesian model reduces the total number of
resource reallocations and the effect of their enactment overhead on frame latency. Figure 8
shows a comparison of end-to-end frame latency for the baseline and the Bayesian decision
models.

Figure 8 demonstrates that the Bayesian model provides consistently improved latency
performance over the course of the application's execution. The Bayesian decision model
spends less time reacting to input spikes and makes smarter decisions than the baseline
model. The benefits of the Bayesian decision model are twofold. First, it can provide
improved end-to-end latency performance during execution. Second, it significantly reduces
the number of resource reallocations necessary to provide this performance.


[Plot; axis label: End-To-End latency (sec.)]

Figure 8: A comparison of frame latencies for the baseline and Bayesian models.


The ability to filter the effects of input spikes is at the heart of the Bayesian model's
improved performance. A trade-off exists between filtering input spikes and reacting to real
load changes. As shown in Figure 7, the Bayesian model is not able to filter very large
input spikes. Attempting to filter these conditions could result in a large [...]
the actual conditions. The profiling scheme used to fine tune the Bayesian parameters to
this application limits the occurrence of this situation to an initial startup transient of the
Bayesian model or after extended periods of stable input with no activity. An important
feature of the Markovian model is its ability to detect these remaining occurrences and reduce
their effect to negligible levels.

4.2  Addition  of  the  Markovian  model 

This  section  presents  the  results  of  coupling  the  Bayesian  and  Markovian  decision  models. 
Figure  9  demonstrates  the  potential  of  the  coupled  model  over  the  pure  Bayesian  model. 


[Plot; axis label: End-To-End latency (sec.)]

Figure 9: A comparison of frame latencies between the Bayesian model and the coupled
model.


As illustrated, the two experiments show similar performance until frame 175. Up to this
point, the application has been conforming to the real-time specifications and the Markovian
model invocations have continually predicted acceptable performance. All resource
reallocations have resulted from the Bayesian model. At frame 175, the coupled system invokes the
Markovian model and it predicts that system performance is heading toward an unacceptable
state. This results in a remapping which is represented by the latency drop in Figure
9. Following this decision, the coupled model has a better resource allocation than the pure
Bayesian model. This is confirmed by the lower end-to-end frame latency shown in Figure 9.
For the next 175 frames, the Markovian model is again inactive as the application is within
the real-time bounds. Both systems are again making purely Bayesian decisions, which
accounts for the similarity in the shapes of the two graphs. When the Markovian model is
invoked at frame 350, it again predicts that the system is heading toward an unacceptable
state. Another Markovian reallocation occurs, which accounts for the slight deviation in the
shape of the graphs between frames 350 and 400. The pure Bayesian system shows an
increasing latency slope beginning around frame 360, while the coupled system does not show
an increase in latency until around frame 385.

These results clearly show the ability of the Markovian model to monitor the global
application performance and initiate a new resource allocation when the real-time specifications
are threatened. This predictive capacity makes the Markovian model perfectly suited to act
as a watchdog over the Bayesian model.

4.3  Coupled  model  performance

This section presents a final set of results. We have demonstrated both the benefits of the
Bayesian model over the baseline model and the benefits of adding a Markovian watchdog
to the Bayesian process. We now investigate the performance of the coupled decision model
with respect to two important input parameters: input rate of change and noise. These
two conditions represent important dimensions of the applications that may utilize adaptive
resource management. For our ATR application, we want to maintain an acceptable frame
rate in the presence of both a high number of targets (input rate) and a large amount of
scenic variations (input noise).

A particular input stream can be described in terms of the range of rate of change [...]

Figure 10: A representation of the input characteristics for our experiments and
reference to the corresponding figures.


Our first experiments consisted of testing the coupled decision model on "average" input
streams with median values for rate of change and noise. The input streams used in these
experiments have stability intervals on the order of 50 frames and input spike probabilities on
the order of 15 percent. Performance results from these experiments comparing the behavior
of the coupled and baseline models are provided in Figure 11.
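To make the two input characteristics concrete, the following Python sketch generates a per-frame workload from a stability interval and a spike probability. It is a hypothetical stand-in, not the synthetic workload generator used in the experiments: the function name, the base-load range, the spike multiplier and the seeding are all assumptions introduced for illustration.

```python
import random


def synthetic_workload(n_frames, stability_interval, spike_probability,
                       seed=0, base_range=(1.0, 2.0), spike_scale=3.0):
    """Return a list of per-frame workload values.

    The base load is redrawn every `stability_interval` frames (a stable
    shift in workload); on each frame a transient spike multiplies the
    load with probability `spike_probability` (input noise).
    """
    rng = random.Random(seed)
    base = rng.uniform(*base_range)
    loads = []
    for frame in range(n_frames):
        if frame > 0 and frame % stability_interval == 0:
            base = rng.uniform(*base_range)  # stable workload shift
        spike = rng.random() < spike_probability
        loads.append(base * spike_scale if spike else base)
    return loads
```

Calling `synthetic_workload(200, 50, 0.15)` would correspond to the "average" conditions described above, while a stability interval of 100 or 20 frames and a spike probability of 0.05 or 0.5 would approximate the low and high settings of rate of change and noise.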

As evidenced in Figure 11, the coupled decision model is able to improve end-to-end
frame latency throughout the course of execution. Under conditions containing median
levels for both input rate of change and noise, both the Bayesian and the Markovian models
contribute toward improved performance. By reducing the total number of decisions and
improving decision quality and timing, latency performance is improved while decision and
enactment overhead is reduced.

The following two experiments investigate input streams with a low probability of input
noise. The average probability of input spikes used in these experiments was 5 percent. We
further divided these experiments into input streams containing either a high or low rate
of input change. Low rate of change input streams used stability intervals on the order of
100 frames, while high rate of change input streams used stability intervals on the order
of 20 frames. Figure 12 compares the results from the low noise and low rate of change
experiments, and Figure 13 compares the results from the low noise and high rate of change
experiments.

[Plot; axis label: End-To-End latency (sec.)]

Figure 11: A comparison of frame latencies between the baseline model and the coupled
model under average input conditions.

[Plot; axis label: End-To-End latency (sec.)]

Figure 12: A comparison of frame latencies between the baseline model and the coupled
model with low rate and low noise.


Under  low  noise  conditions,  the  coupled  decision  model  performs  significantly  better 
than  the  baseline  decision  model.  The  reduction  in  input  spikes  resulting  from  the  low 
noise  behavior  reduces  the  dependence  on  the  Bayesian  smart  filter.  In  these  experiments, 
the  Markovian  model  contributes  more  significant  decisions  than  the  Bayesian  model.  Low 
levels  of  noise  increase  the  accuracy  of  the  Markovian  predictions.  This  allows  the  Markov 
model  to  make  better  decisions  about  when  and  where  to  best  allocate  resources.  Figure  12 
demonstrates  that  good  Markovian  decisions  can  significantly  improve  the  performance  in  a 
low  rate  of  change  input  stream.  Because  of  the  low  rate  of  change,  both  the  baseline  and 
Bayesian  models  do  not  receive  many  detection  events  and  therefore  do  not  trigger/enact 
many  reallocations.  The  Markovian  model  is  able  to  push  the  system  into  a  more  globally 
efficient  state  which  persists  because  of  the  low  input  variation.  Figure  13  demonstrates 
that  in  a  low  noise  environment  the  coupled  decision  model  is  also  effective  for  a  high  rate 
of input change. By adjusting the sampling frequency of the Markovian model to account
for the increased input data rate, the coupled model is able to improve the performance of the
application over a large number of frames.

[Plot; axis label: End-To-End latency (sec.)]

Figure 13: A comparison of frame latencies between the baseline model and the coupled
model with high rate and low noise.

The final two experiments investigate input streams with a high probability of input
noise. The average probability of input spikes used in these experiments was 50 percent. We
further divided these experiments into input streams containing either a high or low rate
of input change. Low rate of change input streams used stability intervals on the order of
100 frames, while high rate of change input streams used stability intervals on the order
of 20 frames. Figure 14 compares the results from the high noise and low rate of change
experiments, and Figure 15 compares the results from the high noise and high rate of change
experiments.


[Plot; axis label: End-To-End latency (sec.)]

Figure 14: A comparison of frame latencies between the baseline model and the coupled
model with low rate and high noise.


Because of the high levels of noise in these two experiments, the improvements in
end-to-end frame latency are not as prominent as in previous experiments. Under these input
conditions, the Bayesian model is more effective than the Markovian model. The frequency
of the input spikes limits the accuracy of the Markovian predictions. The increased noise also
results in a greater number of lower-level resource reallocations. These additional reallocations
ensure that the Markov statistics are initialized more frequently, thereby limiting the
model's effectiveness as a predictor. However, the Markovian model does serve an important
purpose in these experiments. As the embedded smart filter in the Bayesian model attempts
to filter out the input noise, the Markovian model acts as a backup to ensure that it does not
filter any significant load changes. Any sustained performance levels violating the real-time
specifications will still be corrected by the Markovian model.

[Plot; axis label: End-To-End latency (sec.)]

Figure 15: A comparison of frame latencies between the baseline model and the coupled
model with high rate and high noise.


While these figures demonstrate improved latency performance for the coupled decision
model, additional benefits are not apparent from the graphs. The filtering properties of the
Bayesian model, along with the Markovian backup, allow the coupled model to achieve better
latency performance with a fraction of the cost evaluations and resource reallocations required
by the baseline model.

Our experiments demonstrate the effectiveness of the coupled model across a range of
input conditions. The total number of resource reallocations and the number of false
detections are both significantly reduced. These reductions are accomplished while maintaining
improved latency performance. The reduction in both detection and enactment overhead
allows more processing cycles for useful work without sacrificing any latency performance.
This is significant in that these experiments are not based on a fixed number of processor
cycles distributed between useful computation and allocation. In a practical implementation
with real applications, we expect that the latency improvements using a fixed number of
processors will be greater, since the cycles saved by effective detection and prediction will
directly reduce the latency.


5  Conclusions  and  Future  Research 

Clearly,  there  is  a  need  for  efficient  adaptive  resource  mechanisms  to  be  used  with  data- 
dependent,  real-time  applications.  The  mechanisms  must  be  responsive  to  change  and  yet 
accurate  in  their  remapping  requests.  These  quality  requirements  place  a  great  deal  of 
pressure  on  the  remapping  decision  model.  Current  implementations  of  simple  decision 
models  might  not  be  able  to  meet  increasingly  stringent  real-time  requirements.  This  paper 
proposes  an  improved  decision  process  to  provide  increasingly  accurate  resource  mappings 
while  maintaining  low  decision  latency  and  overhead. 

Experiments using a synthetic workload generator and the statically defined model
parameters yielded promising results in multiple categories. An overall reduction in the
percentage of unsuccessful invocations of the cost evaluator and number of unnecessary resource
reallocations was realized with the Bayesian decision model. This allows more cycles for
useful computation and can mask the use of the more complex Markovian decision process.
Experiments with frame latency showed similar or improved performance compared with
the simple decision model for a significantly lower number of remappings.

By coupling the reactive Bayesian model with the predictive Markovian model, we create
a multi-level decision model capable of improving the performance of adaptive resource
managers under a variety of input conditions. Under average input conditions, both models
contribute to decreasing the end-to-end latency of input frames and reducing the decision and
enactment overhead. Toward the extremes, the Bayesian model proves more applicable to
high noise environments and the Markovian model better suited to low noise environments.
In these situations, the less suited model provides good backup support for the other
model. Under low noise conditions, the Bayesian level keeps track with the baseline model
while the Markovian level pushes the system toward more acceptable performance states.
Under high noise conditions, the Bayesian level filters a much larger percentage of the
input spikes while the Markovian level ensures that performance does not fall below the real-time
specifications. Over a wide range of input streams, the coupled model is shown to maintain
or improve the latency performance while decreasing the number of false detections and
unnecessary resource reallocations.

Future work in the context of this system will include methods for dynamically varying the
Bayesian and Markovian thresholds in response to the current task-level resource allocation.
We also plan to implement mechanisms allowing the Markovian model to suggest appropriate
resource allocations for the steady-state behaviors it currently predicts. In addition, we are
currently working on a 3-D tracking system that will allow us to test these decision models
in the framework of an actual application.
ABSTRACT

Mission-critical distributed applications must be able to adapt to mission-dependent variations in resource
demands as well as dynamic changes in resource availability. A middleware layer designed to provide QoS-aware
resource management facilitates application development and follows the current industry trend towards
cost-effective COTS-based implementations. This paper presents the Real Time Adaptive Resource Management system
(RTARM), developed at the Honeywell Technology Center. The RTARM system supports provision of integrated
services for real-time distributed applications and offers services for end-to-end QoS negotiation, QoS adaptation,
real-time application QoS monitoring and hierarchical QoS feedback adaptation. In this paper, we focus on the
hierarchical architecture of RTARM, its flexibility, and the internal mechanisms and protocols that enable management of
resources for integrated services. The extensibility of the architecture is emphasized with the description of several service
managers, including an object wrapper built around the NetEx real-time network resource management system
developed at Texas A&M University. We use practical experiments with a distributed Automatic Target
Recognition application and a synthetic pipeline application to illustrate the impact of RTARM on the application
behavior and to evaluate the system's performance.

Key  words:  adaptive  resource  management,  distributed  real-time  applications,  integrated  services,  QoS  negotiation 
and  adaptation,  hierarchical  feedback  adaptation 
1. Introduction

Current distributed mission-critical environments employ heterogeneous resources that are shared by a host of
diverse applications cooperating towards a common mission goal. These applications are generally a mix of hard-,
soft- and non-real-time applications with different levels of criticality and have a variety of structures, ranging from
periodic independent tasks, multimedia streams and parallel pipelines, to event-driven method-invocation
communicating modules. The applications usually tolerate a range of Quality of Service (QoS) and are ready to
trade off QoS in favor of the most critical functions they perform. The distributed systems must be able to evolve
and adapt to the high variability in resource demands and criticality of the applications as well as to the changing
availability of resources.

The current industry trend is to build distributed environments for mission-critical applications using
“Commercial Off-the-Shelf” (COTS) hardware and software components. A middleware layer above the COTS
components provides consistent management for the system resources, decreases complexity and reduces
development costs.

This paper presents the Real Time Adaptive Resource Management system (RTARM), developed at the
Honeywell Technology Center, that implements a general middleware architecture/framework for adaptive
management for integrated services aimed at real-time mission-critical distributed applications.

The  RTARM  system  has  the  following  basic  features  [4]:  (1)  scalable  end-to-end  criticality-based  QoS  contract 
negotiation  that  allows  distributed  applications  to  share  common  resources  while  maximizing  their  utilization  and 
execution  quality;  (2)  end-to-end  QoS  adaptation  that  dynamically  adjusts  application  resource  utilization 
according  to  their  availability  while  optimizing  application  QoS;  (3)  integrated  services  for  CPU  and  network 
resources  with  end-to-end  QoS  guarantees;  (4)  real-time  application  QoS  monitoring  for  integrated  services  and  (5) 
plug-and-play  architecture  components  for  easy  extensibility  for  new  services. 

The  resource  management  architecture  for  RTARM  uses  an  innovative  approach  that  unifies  heterogeneous 
resources  and  their  management  functions  into  a  hierarchical  uniform  abstract  service  model  [4].  The  building 
block  of  the  architecture  is  the  Service  Manager  (SM).  It  encapsulates  a  set  of  services  and  their  management 
functions  and  exports  a  common  interface  to  clients  and  other  service  managers.  This  facilitates  recursive 
hierarchies, in which heterogeneous services are integrated bottom-up. A higher-level service manager aggregates
services provided by itself and its lower-level SMs and provides clients with a higher-level QoS representation.

In  this  paper,  we  focus  on  the  architecture,  protocols  and  implementation  of  an  RTARM  prototype  that  supports 
integrated  services  for  real-time  distributed  applications.  It  runs  as  a  middleware  on  a  network  of  workstations  and 
uses  CORBA  for  portable  communication.  A  major  contribution  of  our  work  is  the  hierarchical  feedback  adaptation 
mechanism  [1]  that  provides  efficient  dynamic  QoS  control  for  distributed  data-flow  applications.  We  illustrate  the 
RTARM  capabilities  with  a  practical  experiment  with  an  Automatic  Target  Recognition  (ATR)  [8]  distributed 
application  and  with  a  synthetic  pipeline  demonstration  application. 
Other efforts for building adaptive management systems for heterogeneous resources are GRMS [5,6], ARA
[8,10], and QualMan [9]. GRMS is a precursor of RTARM. It introduced the uniform resource model and the
atomic ripple scheduling protocol. Its hierarchical architecture reflects the application data flow and does not offer
feedback adaptation. ARA considers a discrete set of runtime configurations for distributed applications and does
feedback adaptation by resource reallocation. The ARA architecture is non-recursive and differs considerably from
the uniform RTARM architecture by using proxies for specific service providers. QualMan is designed for
multimedia applications and defines two basic resource management components, the resource scheduler and the
QoS broker, that adhere to a uniform resource model without considering deeper recursive structures and QoS
composition.

The  rest  of  this  paper  is  organized  as  follows.  Section  2  describes  the  RTARM  hierarchical  architecture,  system 
models  and  interfaces.  Section  3  presents  the  architecture  of  a  SM  and  describes  the  CPU,  network  and  a  higher- 
level  SM.  Section  4  continues  with  experiments  involving  an  ATR  application  and  synthetic  pipeline  applications 
that  emphasize  the  RTARM  capabilities.  The  paper  concludes  in  Section  5  with  a  discussion  and  future  plans. 

2.  The  RTARM  System  Architecture 

We  have  designed  and  implemented  the  RTARM  system  prototype  as  a  middleware  layer  above  the  operating 
system  and  network  resources.  The  middleware  approach  provides  the  benefit  of  flexibility  and  portability  but  the 
increased  distance  to  the  basic  resources  makes  fine-grained  control  difficult.  The  RTARM  servers,  developed  in 
C++,  run  as  user-level  processes  on  Windows  NT  workstations  and  export  a  CORBA  (Orbix  [7])  interface  to  clients 
and  applications.  The  RTARM  model  differentiates  between  clients  and  applications.  A  client  is  any  entity  that 
issues  a  request  for  services  and  negotiates  a  QoS  contract  that  defines  the  allocated  services.  An  application 
consumes  services  reserved  by  a  client  on  its  behalf  and  continuously  cooperates  with  the  resource  management 
system  to  achieve  the  best  available  QoS  while  maintaining  its  runtime  parameters  within  the  contracted  region. 
The  QoS  contract  may  change  during  the  application  lifetime. 

2.1  The  Service  Manager  Hierarchy 

The  RTARM  system  employs  a  hierarchical  resource  management  architecture  that  facilitates  provision  of 
integrated  services  over  heterogeneous  resources.  The  uniform  resource  model  [4]  defines  a  recursive  structural 
entity  called  Service  Manager  (SM)  that  encapsulates  a  set  of  resources  and  their  management  mechanism.  At  the 
bottom  of  the  hierarchy  are  SMs  that  provide  management  functions  for  basic  resources,  such  as  CPU  or  network 
resources,  and  directly  control  resource  utilization  by  application  components.  Higher  level  services  are  assembled 
on  top  of  lower-level  services,  giving  rise  to  a  service  hierarchy. 

Resources  as  well  as  negotiation  requests  are  treated  uniformly  across  the  entire  hierarchy.  Higher-level  service 
managers  (HSM)  may  act  as  clients  for  lower-level  SMs  (LSM).  The  hierarchy  allows  dynamic  configuration  as 
new  service  managers  can  join  the  system  at  any  time.  A  request  for  an  integrated  service  sent  to  an  HSM  may 
require  resources  from  lower-level  service  providers.  The  admission  protocol  builds  a  virtual  spanning  tree  over  the 
SM hierarchy that remains valid for the entire application lifetime. The SM hierarchy forms a directed acyclic
graph, with SMs as nodes and edges represented by the “uses-services-from” relation.
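
The directed-acyclic-graph property of the SM hierarchy can be illustrated with a short check. This is only a sketch in Python (RTARM itself is written in C++): the function name `bottom_up_order`, the SM names, and the assumption that each HSM uses both LSMs are hypothetical. It relies on the standard-library `graphlib` module (Python 3.9+) to order SMs so that each lower-level SM precedes the higher-level SMs that use its services, and to reject a relation that contains a cycle.

```python
from graphlib import CycleError, TopologicalSorter


def bottom_up_order(uses_services_from):
    """Return SM names ordered so that every LSM precedes the HSMs that use it.

    uses_services_from maps an SM name to the set of SM names it uses services from.
    Raises ValueError if the relation contains a cycle, i.e. it is not a DAG.
    """
    try:
        return list(TopologicalSorter(uses_services_from).static_order())
    except CycleError as exc:
        raise ValueError(f"SM hierarchy is not acyclic: {exc.args[1]}") from exc


# Hypothetical hierarchy: two HSMs sharing a CPU SM and a Network SM.
hierarchy = {
    "HSM1": {"CPU_SM", "Network_SM"},
    "HSM2": {"CPU_SM", "Network_SM"},
}
order = bottom_up_order(hierarchy)
```

A bottom-up order of this kind matches the way heterogeneous services are integrated from the basic resource SMs upward.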

Figure  1  illustrates  a  simple  RTARM  hierarchy  with  two  LSMs,  a  CPU  and 
a  Network  SM,  at  the  bottom  of  the  hierarchy.  Two  clients  request  services 
from  the  two  HSMs  while  applications  are  consuming  CPU  and  network 
resources.  Section  3  describes  the  service  managers  in  more  detail. 

There are several benefits from a hierarchical, recursive resource management architecture. First, services with
complex QoS representations are easier to implement on top of basic services. Complex distributed applications
benefit from a richer representation of QoS. It simplifies the application design and facilitates consistent resource
management for QoS-incompatible applications. Regardless of how complex the application architecture and QoS
semantics are at the top of the SM hierarchy, at the bottom of the hierarchy everything translates to QoS requests
for basic services (CPU and network in our implementation).

Figure 1. Sample RTARM hierarchy

The  hierarchical  architecture  of  RTARM  scales  well  with  large  distributed  environments.  Many  SMs  grouped  in 
clusters  may  benefit  from  service  localization  and  avoid  communication  bottlenecks.  Sharing  of  LSMs  between 
HSMs  adds  redundancy,  fault  tolerance  and  load  balancing. 

A  potential  drawback  for  deep  SM  hierarchies  comes  from  the  increased  distance  between  the  top-most-level 
SM  and  bottom  layer  in  the  hierarchy.  This  may  cause  high  latency  for  time  sensitive  RTARM  functions,  such  as 
feedback  adaptation  and  application  control. 

Issues  related  to  deadlock  prevention  and  distributed  SM  synchronization  have  been  studied  for  the  GRMS 
project  [5,6]  and  can  be  easily  extended  to  the  RTARM  model. 


2.2 RTARM System Models

QoS Model and Translation

The  quality  of  the  interaction  of  a  mission-critical  application  with  a  dynamic  environment  directly  reflects  its 
performance.  The  wide  magnitude  of  this  interaction  requires  a  range  for  the  quality  measures.  RTARM  supports  a 
multidimensional QoS representation, each dimension specifying an acceptable range [Q_min, Q_max] of a quality
parameter  for  the  application.  A  set  of  range  specifications,  one  per  dimension,  defines  a  QoS  region.  This  QoS 
model  facilitates  resource  negotiation  and  makes  resource  management  more  flexible. 

In  the  RTARM  recursive  hierarchy,  the  QoS  representation  at  a  SM  reflects  the  type  of  services  provided  by  that 
SM.  An  HSM  translates  a  QoS  request  for  integrated  services  into  individual  QoS  requests  for  services  provided  by 
itself  and  its  lower-level  SMs.  When  the  SM  receives  replies  from  its  LSMs,  it  reassembles  the  returned  QoS  into 
its  own  QoS  representation  in  a  process  called  QoS  reverse-translation. 

RTARM uses a unique implementation for QoS, which is independent of the addressed service. We define a
QoS parameter as a set of name-value pairs, where the value part is a sequence of one or more scalar primitive data
values (string, short, double, etc.) and the name indicates the specific QoS dimension, such as “rate”, “workload”,
“latency”, etc.
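
The QoS region described above can be sketched as a small data structure. This is a minimal illustration, not the RTARM C++ implementation: the class names `QoSRange` and `QoSRegion`, the method `contains`, and the numeric values are all hypothetical, and only scalar numeric dimensions are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSRange:
    """Acceptable range [q_min, q_max] for one QoS dimension (hypothetical name)."""
    q_min: float
    q_max: float

    def __post_init__(self):
        if self.q_min > self.q_max:
            raise ValueError("q_min must not exceed q_max")

    def contains(self, value):
        return self.q_min <= value <= self.q_max


class QoSRegion:
    """A set of named ranges, one per QoS dimension."""

    def __init__(self, ranges):
        self.ranges = dict(ranges)

    def contains(self, point):
        """True if the point names exactly the region's dimensions and every value is in range."""
        if set(point) != set(self.ranges):
            return False
        return all(self.ranges[name].contains(value) for name, value in point.items())


# Illustrative values only: a rate in frames per second and a latency in seconds.
region = QoSRegion({"rate": QoSRange(5.0, 30.0), "latency": QoSRange(0.0, 0.2)})
inside = region.contains({"rate": 15.0, "latency": 0.1})
outside = region.contains({"rate": 40.0, "latency": 0.1})
```

Keying the ranges by dimension name mirrors the name-value representation, so the same structure can describe CPU, network or higher-level QoS without service-specific types.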

Adaptation  Model 

RTARM recognizes three situations when application QoS may be changed after admission [4]: (1a) QoS
shrinking/reduction of lower criticality applications when a new application comes; (1b) QoS
expansion/improvement when applications depart and release resources, and (2) feedback adaptation. While (1a)
and (1b) imply contract changes and involve other applications, feedback adaptation does not change the contract
but only varies the current operational point of the application within the contracted QoS region. Feedback
adaptation is like closed loop control. It relies on monitoring of delivered QoS and uses the difference between
delivered and desired QoS to adapt the application behavior.
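
The closed-loop character of feedback adaptation can be sketched as a single control step. The text states only that the difference between delivered and desired QoS drives the adaptation; the proportional rule, the `gain` parameter, the choice of latency as the monitored quantity and rate as the adjusted one, and the function names are assumptions made for illustration.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def feedback_step(current_rate, delivered_latency, desired_latency,
                  rate_min, rate_max, gain=10.0):
    """One closed-loop step (assumed proportional rule).

    The rate is lowered when delivered latency exceeds the desired latency and
    raised when there is headroom. The result never leaves the contracted range
    [rate_min, rate_max], so the contract itself is not changed.
    """
    error = desired_latency - delivered_latency
    return clamp(current_rate + gain * error, rate_min, rate_max)


# Illustrative values: contracted rate range of 5 to 30 frames per second.
slower = feedback_step(20.0, delivered_latency=0.3, desired_latency=0.2,
                       rate_min=5.0, rate_max=30.0)
floor = feedback_step(6.0, delivered_latency=1.2, desired_latency=0.2,
                      rate_min=5.0, rate_max=30.0)
```

The clamp is the part that corresponds directly to the text: the operational point moves, but only inside the contracted QoS region.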


2.3 RTARM Interfaces

Each SM implements and exports three interfaces: (1) Negotiator for admission control, collateral adaptation,
QoS expansion and application control, such as suspend, resume and end; (2) Service Manager for SM hierarchy
setup (register/deregister SM) and (3) Monitor for application monitoring and event propagation.

For admission control and adaptation RTARM uses a modified version of the GRMS Ripple Scheduling
algorithm [5,6]. It consists of a transaction-based two-phase commit protocol applied recursively at each SM. The
first phase executes a service availability test starting from the SM that received the admission request, down the
spanning tree that resulted from the QoS translation and request dispatch process. The available, reserved QoS
propagates back to the initiator SM from the lowest SM layer, being reverse-translated along the way. In the second
phase, the initiator SM assesses the success status of the reservation phase and the transaction is committed or
aborted, causing the service reservations along the spanning tree to be committed or cancelled, respectively. If
not enough resources are available, an SM tries to adapt lower criticality applications to their minimum contracted
QoS and use the released resources for the new application. Later, when resources become available, the SM
expands the QoS for the most critical applications.
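
The recursive two-phase structure of the admission protocol can be sketched as follows. This is a simplified Python model under stated assumptions, not the RTARM or GRMS implementation: each SM manages a single scalar capacity, the request is already translated into a per-SM demand dictionary, the hierarchy is a tree, and adaptation of lower criticality applications is omitted. The method names echo the interface calls, but their signatures here are hypothetical.

```python
class ServiceManager:
    """Toy service manager holding one scalar resource (illustrative only)."""

    def __init__(self, name, capacity=0, lsms=()):
        self.name = name
        self.capacity = capacity   # free amount of this SM's own resource
        self.lsms = list(lsms)     # lower-level SMs this SM uses services from
        self.pending = {}          # app_id -> amount tentatively reserved (phase I)
        self.committed = {}        # app_id -> amount committed (phase II)

    def test_reservation(self, app_id, request):
        """Phase I: reserve locally, then recurse down the tree."""
        amount = request.get(self.name, 0)
        if amount > self.capacity:
            return False
        self.capacity -= amount
        self.pending[app_id] = amount
        return all(lsm.test_reservation(app_id, request) for lsm in self.lsms)

    def commit_reservation(self, app_id):
        """Phase II (success): turn tentative reservations into commitments."""
        self.committed[app_id] = self.pending.pop(app_id, 0)
        for lsm in self.lsms:
            lsm.commit_reservation(app_id)

    def cancel_reservation(self, app_id):
        """Phase II (failure): give back whatever phase I reserved."""
        self.capacity += self.pending.pop(app_id, 0)
        for lsm in self.lsms:
            lsm.cancel_reservation(app_id)

    def admit_app(self, app_id, request):
        """Run both phases from the initiator SM and return the admission status."""
        if self.test_reservation(app_id, request):
            self.commit_reservation(app_id)
            return True
        self.cancel_reservation(app_id)
        return False


# Illustrative units: CPU in percent, network in arbitrary bandwidth units.
cpu = ServiceManager("cpu", capacity=100)
net = ServiceManager("net", capacity=10)
hsm = ServiceManager("hsm", lsms=[cpu, net])
first = hsm.admit_app("app1", {"cpu": 60, "net": 4})
second = hsm.admit_app("app2", {"cpu": 30, "net": 8})
```

In the second request the CPU reservation succeeds but the network reservation fails, so the cancel phase returns the tentatively reserved CPU share and the system state is the same as before the request.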

Sometimes  in  order  to  admit  a  new,  more  critical  application,  it  is  enough  to  squeeze  the  QoS  of  only  a  part  of 
an  existing  distributed  application.  Then  changes  in  the  high-level  QoS  may  require  collateral  adaptation  of  other 
components  of  the  application  that  do  not  directly  impact  admission  of  the  new  application.  For  instance,  for  a 
multimedia stream application having frame rate as QoS parameter, if one processing stage is adapted to the
minimum rate, then all other stages will run at the same low rate.
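
The multimedia stream example can be written out in a few lines. This is a sketch of the effect only, assuming a linear pipeline whose stages must all run at one common frame rate; the function name `collateral_adapt` and the stage names are hypothetical.

```python
def collateral_adapt(stage_rates):
    """Align every stage of a pipeline with the rate of its slowest stage."""
    if not stage_rates:
        return {}
    common_rate = min(stage_rates.values())
    return {stage: common_rate for stage in stage_rates}


# The "filter" stage was shrunk to 10 frames/s to admit a more critical application.
stages = {"decode": 30.0, "filter": 10.0, "display": 30.0}
adapted = collateral_adapt(stages)
```

Only the shrunk stage releases resources needed by the new application; the other stages are adapted collaterally so that the pipeline stays consistent.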

Next  follows  the  list  of  calls  from  the  RTARM  CORBA  interfaces: 


The  RTARM  Negotiator  interface  for  admission  consists  of  the  following  set  of  calls: 

• boolean admit_app(in appId, in request, out admittedQoS) - admit new application; it embeds
both phases of the ripple scheduling algorithm. Return admission status.

• boolean test_reservation(in appId, in request, out admittedQoS, in candidateApps, out
shrinkUsed, out adaptedAppsQoS, in hsmName) - phase I of admission protocol: try to reserve
resources. Adapt candidateApps if necessary. Return admission status, QoS and adaptedApps.

• boolean commit_reservation(in appId, in operationalQoS, in adaptedApps, out shrunkAppQoS)
and boolean cancel_reservation(in appId, in adaptedApps) - phase II: commit/cancel reservation.


The RTARM Negotiator interface for collateral shrink:

• boolean test_adapt(in newAppName, in appsToAdapt) - phase I for collateral adaptation: shrink QoS
for the applications in appsToAdapt list, mark for newAppName.

• boolean commit_adapt(in newAppName, in appsToAdapt, out adaptedAppsQoS) and

• boolean cancel_adapt(in newAppName, in appsToAdapt) - phase II: commit/cancel collateral
adaptation for applications in appsToAdapt list.

The RTARM Negotiator interface for QoS Expansion:

• boolean test_expansion(in appId, out availableQoS) - phase I: try expansion for application appId
and return status and availableQoS. If successful, services have been reserved.

• boolean commit_expansion(in appId, in committedQoS) and

• boolean cancel_expansion(in appId) - phase II: commit expansion to committedQoS or cancel.


The Negotiator interface for application control:

• void end_app(in appId) - terminate application.

• void suspend_app(in appId) - suspend application.

• void resume_app(in appId) - resume execution.

• void set_qos(in appId, in newqos) - change application QoS.

• QoS get_qos(in appId) - get application QoS.


Service Manager interface for the SM hierarchy setup:

• boolean register_lsm(in name, in myParams, out hsmParams, out monitor) - register self as an
LSM with the CORBA name at an HSM. Return HSM parameters and HSM monitor. The SM parameters
include server name and a list of services it provides (cpu, network, pipeline, ...).

• boolean register_hsm(in name, in myParams, out lsmParams, in monitor) - register self as an
HSM with the CORBA name at an LSM. Pass my parameters and my monitor. Return LSM parameters.

• boolean deregister_sm(in smName) - remove SM with name smName from the list of SMs.

A SM cannot register twice to the same SM, but can be LSM and HSM for SMs in two distinct sets.

The Monitor interface for event communication and QoS reporting:

• oneway void event(in appId, in originator, in event, in type) - send event to SM Monitor.


The  next  section  presents  the  object  architecture  of  the  SM  and  details  the  implementation  of  a  CPU,  a  Network 
and  a  Higher-level  SM. 


3.  RTARM  Service  Managers 

3.1  The  Service  Manager  Architecture  and  Implementation 

The unified resource model provides the benefits of a uniform internal architecture for all service managers
(shown in Figure 2) and a common interface between them.

Figure 2. The internal object architecture of a service manager

The  arrows  in  the  figure  indicate  object  service  requests.  The  components  in  a  SM  are  as  follows: 

•  Negotiator:  brokers  contract  admission,  delegates  responsibilities  to  other  components  and  exports  external 
RTARM  CORBA  interface. 

•  Translator:  translates  higher-layer  integrated  QoS  into  lower-layer  QoS  representation. 

•  Allocator:  handles  resource  allocation/release  when  no  adaptation  is  necessary. 

•  Adapter:  handles  resource  allocation/release  with  adaptation  and  QoS  expansion/contraction. 

•  Scheduler:  determines  whether  allocation  of  resources  and  expansion  of  application  QoS  are  feasible. 

•  Enactor:  enforces  changes  in  application  QoS  or  status. 

•  Monitor:  keeps  an  eye  on  applications  in  execution  and  passes  status  information  and  QoS  usage  to  the 
Detector.  Exports  external  RTARM  CORBA  interface. 

• Detector: uses application runtime information (e.g. current QoS operational point) to detect significant
changes in application operation (e.g. overload, underutilization, contract violation). Triggers Feedback
Adapter actions.

•  Feedback  Adapter:  decides  corrective  actions  for  applications  when  their  runtime  status  changes 
significantly. 

Additional  data  structures  exist  to  hold  information  regarding  application  contracts,  other  service  managers  and 
available  services. 

As an illustration of the SM component interaction, Figure 3 shows object collaboration diagrams for two
relevant interface calls for admission, test_reservation() and commit_reservation(), as they implement
phases I and II of the admission protocol for a CPU SM.

Figure 3. Sample collaboration diagrams for the Negotiator admission interface:
a) Negotiator::test_reservation(); b) Negotiator::commit_reservation()
Applications implement a simple CORBA interface that allows SMs to change their QoS and status. LSMs keep
proxies  for  the  application  CORBA  server  objects.  All  RTARM  CORBA  servers  and  applications  are  started  in  the 
shared,  multi-client  activation  mode. 

A  SM  component  class  has  the  same  object  interface  regardless  of  the  SM  position  in  the  hierarchy  or  the 
resources the SM controls. For instance, the Adapter object implements the same functions in all SMs, but in a
way that depends on the scope of the SM. Not all components are required within a SM. For example, a
Translator  may  exist  only  inside  an  HSM. 

RTARM  provides  a  common  object  oriented  execution  framework  that  allows  users  to  dynamically  load  SM 
components  from  shared  libraries  during  runtime  configuration.  A  configuration  manager  uses  a  mechanism  similar 
to  a  Factory  Method  [3]  to  instantiate  SM  components.  It  also  passes  configuration  information  extracted  from  a 
configuration  file  to  the  SM  components  during  their  initialization.  For  all  SMs  there  is  a  single  executable  program 
that  originally  contains  the  empty  SM  framework  and  the  configuration  manager.  By  loading  specialized 
components  from  shared  libraries,  the  configuration  manager  practically  starts  different  SMs.  We  use  this  technique 
when  we  initialize  the  CPU,  Network  and  Higher-level  SMs  with  components  from  specific  Windows  NT  DLLs. 

The  flexibility  of  this  plug-and-play  feature  permits  implementation  of  a  new  SM  by  just  replacing  a  set  of 
components  that  realize  a  particular  SM  component  interface,  without  rewriting  the  whole  program.  Writing  a  new 
SM  component  only  requires  the  header  file  with  the  object  interface,  the  executable  program  (common  execution 
framework)  and  its  corresponding  library. 
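
The plug-and-play configuration described above can be sketched with a registry-based factory. RTARM loads C++ components from Windows NT DLLs; this Python sketch substitutes an in-process registry for the shared-library loading, so it illustrates the idea rather than the mechanism. All class names, registry keys and configuration fields are hypothetical.

```python
class Scheduler:
    """Common component interface: every scheduler answers a feasibility question."""

    def __init__(self, params):
        self.params = params

    def feasible(self, demand):
        raise NotImplementedError


class CpuScheduler(Scheduler):
    def feasible(self, demand):
        return demand < self.params["capacity"]


class NetworkScheduler(Scheduler):
    def feasible(self, demand):
        return demand <= self.params["bandwidth"]


# Stand-in for components that would be discovered in shared libraries.
COMPONENT_REGISTRY = {
    "cpu_scheduler": CpuScheduler,
    "network_scheduler": NetworkScheduler,
}


def build_service_manager(config):
    """Instantiate SM components from a configuration dictionary.

    config maps a component role to {"type": registry key, "params": {...}}.
    The same empty framework becomes a different SM depending on the config.
    """
    components = {}
    for role, spec in config.items():
        try:
            factory = COMPONENT_REGISTRY[spec["type"]]
        except KeyError:
            raise ValueError(f"unknown component type: {spec['type']}") from None
        components[role] = factory(spec.get("params", {}))
    return components


cpu_sm = build_service_manager(
    {"scheduler": {"type": "cpu_scheduler", "params": {"capacity": 1.0}}}
)
```

Because callers depend only on the `Scheduler` interface, swapping the entry in the configuration is enough to obtain a different SM from the same framework.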

3.2  The  CPU  Service  Manager 

The CPU SM provides periodic applications access to a processor resource. Each computing node has a CPU SM,
allowing concurrent applications to share a CPU. The application QoS is bi-dimensional: application
execution rate (R) and iteration execution time (W) (Figure 4). The COP (Current Operational Point)
represents the current values for the multidimensional QoS.

Figure 4. CPU SM QoS (axes: Rate, Workload; curve: R x W = % CPU utilization = constant)

Admission  and  Adaptation 

The specific CPU scheduling policy is isolated within the Scheduler object and the Monitor keeps track of application CPU utilization. The invariant condition for admission and schedulability for n applications is $\sum_{i=1}^{n} R_i W_i < 100\%$ processor utilization. A more sophisticated CPU SM can be implemented at any time, by just using the plug-and-play feature, replacing the default Scheduler component with one specific to the scheduling discipline used.
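As an illustration of the admission invariant, the following sketch checks whether a new periodic application fits on a processor. The function name, the tuple layout and the strict comparison are assumptions made for the example; rates are in Hz and workloads in seconds per iteration.

```python
def admissible(running, new_app, capacity=1.0):
    """Return True if the summed utilization R_i * W_i stays below capacity.

    running: list of (rate_hz, workload_s) tuples for admitted applications.
    new_app: (rate_hz, workload_s) tuple for the candidate application.
    """
    total = sum(rate * work for rate, work in running)
    total += new_app[0] * new_app[1]
    return total < capacity


running = [(10, 0.02), (5, 0.05)]              # utilization 0.20 + 0.25
fits = admissible(running, (20, 0.02))         # adds 0.40
overloads = admissible(running, (20, 0.03))    # adds 0.60
```

In this example the first candidate brings the total to about 85% and is admitted, while the second would exceed 100% and is refused.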

The  CPU  SM  service  allocation  unit  for  each  periodic  application  is  the  fraction  of  CPU  utilization  (R  x  W).  The 
CPU  SM  communicates  this  information  to  applications  and  assumes  they  are  well  behaved  and  keep  their  process 
utilization  below  the  allocated  limits.  The  SM  scheduler  only  assigns  application  rates  and  does  not  control  the 
underlying OS scheduler. This policy works fine on a larger time scale and for our experimental purposes. For real-time performance one solution is to implement a soft real-time CPU scheduling server above the OS scheduler [9]. Commercial operating systems with soft real-time capabilities, like Windows NT and Solaris, limit the scheduler granularity to 10-20ms.

The  CPU  SM  implements  the  Ripple  Scheduling  admission  protocol.  Because  it  is  at  the  bottom  of  the  SM 
hierarchy  and  has  no  LSMs,  it  does  not  make  any  other  recursive  calls.  Adaptation  and  collateral  adaptation 
(Sections  2.2,  2.3)  reduce  the  application  rate  to  the  minimum  contracted  value.  QoS  expansion  increases  the 
application  contracted  QoS  (rate)  to  the  best  available  value. 

Feedback  Adaptation 

The CPU SM controls the task rate in real-time. It cannot change the workload, which is left exclusively under application control. Applications send their current QoS operational point as events to the CPU SM monitor at the end of each periodic iteration. At any moment, the QoS COP may vary so that R x W < L, where L is the fraction of the contracted processor utilization. The CPU SM adjusts the COP as follows: (1) increase rate when workload decreases; (2) decrease rate on overload, when the workload pushes the COP outside the contracted region.
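The two adjustment rules can be sketched as a single function. The specific choice of moving the rate to L / W, clamped to the contracted rate range, is an assumption made for illustration; the source only states the direction of the adjustment.

```python
def adjust_rate(rate, workload, limit, r_min, r_max):
    """Return a new rate for the reported workload (assumes workload > 0).

    limit is the contracted utilization fraction L; r_min and r_max bound
    the contracted rate range. All names here are hypothetical.
    """
    target = limit / workload            # rate at which R x W equals L
    if rate * workload > limit:
        new_rate = target                # overload: decrease the rate
    elif rate < target:
        new_rate = target                # headroom: increase the rate
    else:
        new_rate = rate
    return max(r_min, min(r_max, new_rate))


# Contracted rate range 0.2-1.0 Hz (periods of 1-5 s), utilization limit 0.5.
slower = adjust_rate(1.0, 0.8, 0.5, 0.2, 1.0)     # workload grew to 0.8 s
faster = adjust_rate(0.625, 0.25, 0.5, 0.2, 1.0)  # workload fell to 0.25 s
```

The clamp keeps the rate inside the contract even when the workload alone would call for a value outside it.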

3.3  The  Network  Service  Manager 

We integrated the NetEx real-time network management system [2,11] from Texas A&M University into the RTARM system. NetEx runs as a middleware and provides connection-oriented real-time communication with guaranteed delay and bandwidth over COTS network infrastructure, such as ATM and switched 10/100 Mbps Ethernet. NetEx uses a tri-dimensional QoS: period, delay and message size and adds the connection source and destination network addresses to the connection contract. The NetEx resource management interface is, however, incompatible with the RTARM interfaces. It has different semantics and it does not export the two-phase commit protocol. We built an object-oriented wrapper [3] around NetEx that hides the incompatibilities and exports the RTARM interface to clients, applications and HSMs (Figure 5). The wrapper method can be applied to integrate any service provider in the RTARM architecture.

[Figure 5: NetEx wrapper diagram; visible component labels: Allocator, Scheduler]

Figure 5. The NetEx Object Wrapper

The wrapper implements three SM components, Negotiator, Adapter and Enactor, that map the RTARM interface calls for admission, adaptation and expansion to the native NetEx API. NetEx does not provide feedback adaptation for connections, so the wrapper SM does not implement feedback adaptation either. It is important to note, however, that our HSM for integrated services for parallel pipeline applications implements hierarchical feedback adaptation. This is detailed in Section 3.4.
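One way an object wrapper can present a two-phase interface over a provider that only offers a one-step call is sketched below. `NativeProvider` is a hypothetical stand-in and does not reproduce the NetEx API; holding a tentatively opened connection until commit or cancel is an assumed strategy, not necessarily the one used in the RTARM wrapper.

```python
class NativeProvider:
    """Hypothetical provider with a one-step open/close API (not the NetEx API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.connections = {}

    def open_connection(self, conn_id, bandwidth):
        used = sum(self.connections.values())
        if used + bandwidth > self.capacity:
            return False
        self.connections[conn_id] = bandwidth
        return True

    def close_connection(self, conn_id):
        self.connections.pop(conn_id, None)


class ProviderWrapper:
    """Exports test/commit/cancel calls on top of the one-step provider."""

    def __init__(self, provider):
        self.provider = provider
        self.pending = set()
        self.committed = set()

    def test_reservation(self, app_id, bandwidth):
        # Phase 1: reserve tentatively by opening the connection natively.
        if self.provider.open_connection(app_id, bandwidth):
            self.pending.add(app_id)
            return True
        return False

    def commit_reservation(self, app_id):
        # Phase 2: the tentative connection becomes the contract.
        self.pending.remove(app_id)
        self.committed.add(app_id)

    def cancel_reservation(self, app_id):
        # Roll back a tentative reservation that was never committed.
        if app_id in self.pending:
            self.pending.discard(app_id)
            self.provider.close_connection(app_id)
```

A caller such as an HSM sees only the three wrapper methods, so the provider behind them can be replaced without changing the caller.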

3.4  The Higher-level Service Manager for Integrated Services

Within  the  RTARM  service  manager  hierarchy,  HSMs  aggregate  services  from  LSMs  (CPU,  Network  or  any 
other  type  of  SM)  and  provide  RTARM  services  to  applications  that  need  a  more  complex  QoS  representation.  The 
unified resource model enables recursive deployment of HSMs. Our HSM implementation is generic and is able to support various types of distributed applications with arbitrary QoS representations that map to available LSM QoS. The only restriction is that the Ripple Scheduling admission and adaptation procedure and the hierarchical feedback adaptation must not contradict the application semantics. The QoS Translator SM component inside an HSM is responsible for translating a QoS request into the per-service QoS requests that the LSMs understand. Replacing the translator component with a different one (for a different QoS representation) produces an HSM capable of supporting different integrated services.

Admission  and  Adaptation 

Figure  6  shows  the  pseudo-code  for  the  recursive  two-phase  admission  protocol  that  runs  at  the  heart  of  each 
HSM: 


test_reservation (reqQos, avQos, candidates, adaptedApps) {
    translate reqQos into: LS - list of requested services from LSMs, and
                           LreqQos - corresponding QoS per service
    for each service S from LS {
        for each LSM lsm that provides service S {
            success = lsm->test_reservation (LreqQos[lsm], lsmAvQos[S],
                          candidates that run on lsm, lsmAdaptedApps[S])
            if success then mark admitted service and continue with next service S from LS
        }
        if service S was not admitted then {
            cancel all previous successful admissions and
            return false
        }
    }
    // now all services from LS have been admitted
    reverse-translate and maximize the returned QoS from lsmAvQos into avQos
    perform collateral adaptation if necessary
    return true
}

commit_reservation (committedQos, adaptedApps) {
    translate committedQos into:
        Llsm - list of LSMs and
        LcommittedQos - committed QoS per service
    for each lsm from Llsm {
        lsm->commit_reservation (LcommittedQos[lsm], adaptedApps that run on lsm)
    }
    save committedQos into the application contract
}

Figure  6.  Pseudo-code  for  the  two-phase  commit  admission  protocol 

The cancel_reservation() call is similar to commit_reservation() and is omitted here.
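The two-phase protocol of Figure 6 can be exercised with a small runnable sketch. It is deliberately simplified: each service request is reduced to a single numeric amount, and adaptation of lower-criticality applications, QoS reverse-translation and the avQos return value are left out. The class names `LeafSM` and `HigherSM` are hypothetical.

```python
class LeafSM:
    """Lower-level SM with a single capacity (simplified stand-in for an LSM)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.tentative = {}   # reservation id -> amount, phase 1 only
        self.committed = {}   # reservation id -> amount, after phase 2

    def test_reservation(self, res_id, amount):
        used = sum(self.committed.values()) + sum(self.tentative.values())
        if used + amount > self.capacity:
            return False
        self.tentative[res_id] = amount
        return True

    def commit_reservation(self, res_id):
        self.committed[res_id] = self.tentative.pop(res_id)

    def cancel_reservation(self, res_id):
        self.tentative.pop(res_id, None)


class HigherSM:
    """HSM that admits an application only if every requested service is admitted."""

    def __init__(self, providers):
        self.providers = providers   # service name -> list of LSMs offering it
        self.placement = {}          # app id -> list of (reservation id, LSM)

    def test_reservation(self, app_id, requests):
        admitted = []
        for service, amount in requests.items():
            res_id = (app_id, service)
            # Try each LSM providing this service; stop at the first success.
            lsm = next((l for l in self.providers[service]
                        if l.test_reservation(res_id, amount)), None)
            if lsm is None:
                # Cancel all previous successful admissions.
                for rid, l in admitted:
                    l.cancel_reservation(rid)
                return False
            admitted.append((res_id, lsm))
        self.placement[app_id] = admitted
        return True

    def commit_reservation(self, app_id):
        for rid, lsm in self.placement[app_id]:
            lsm.commit_reservation(rid)
```

Because `LeafSM` and `HigherSM` expose the same method names, an `HigherSM` instance could itself appear in another manager's provider list, which mirrors the recursive hierarchy described in the text.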

Figure 7 illustrates examples of admission of a new application with id 3 at an HSM H that has 3 LSMs, L1, L2, L3. Applications 1 and 2 are already running at H and use services from L1, L2, L3. For example, application 1 (denoted with 1 at H) runs also at L1 (1.1), at L2 (1.2) and L3 (denoted 1.3). The new application 3 requires two services and maps to 3.1 and 3.2. In example a) both 3.1 and 3.2 are admitted at L1 and L2. Admission for 3.1 needs adaptation of application 1.1 on L1. This triggers collateral adaptations for 1.2 as well as 1.3, as the entire application 1 must be adapted. Calls 4 and 5 (test_adapt) ask L2 and L3 to adapt application 1 collaterally. During the execution of commit_reservation on H (call number 6), the collateral adaptation of 1 is committed on L1 and L2 with the two commit_reservation calls plus the extra commit_adapt call (9) to L3. Example b) shows the call sequence when application 3 is accepted by L1, but rejected both by L2 and L3. HSM H finally rejects 3 and returns false to the test_reservation call 1.


[Figure 7, panel a) Successful admission of application 3. RTARM calls: 1, 2, 3 - test_reservation; 4, 5 - test_adapt; 6, 7, 8 - commit_reservation; 9 - commit_adapt]

[Figure 7, panel b) Failed admission of application 3 (3.2 not admitted). RTARM calls: 1, 2, 3, 4 - test_reservation; 5 - cancel_reservation]

Figure 7. Examples of the admission protocol sequence executed at an HSM


We have implemented a Pipeline Service Manager (PSM), an HSM that aggregates services from lower-level SMs (CPU, Network, other HSMs) into a higher-level integrated representation suited for pipeline applications. Our PSM supports periodic independent tasks and periodic parallel pipeline applications, consisting of communicating stages in an arbitrary configuration, with a single source and a single sink node.

We assume a sensor periodically enters data frames into the pipeline. Each frame is processed by a stage or a composite stage [1] (consisting of parallel strings of elementary stages) and then sent to the next stage. Such a pipeline application is depicted in Figure 8.

For periodic pipeline applications, we use a QoS consisting of end-to-end message latency and rate for the final stage. The admission contract also contains execution time for each stage as well as the message size for each inter-stage connection. It is the job of the pipeline translator to decompose the integrated-service pipeline request into CPU and network admission requests. We assume all stages use the same range for rate. The pipeline QoS (end-to-end latency, frame rate plus stage workloads and message sizes) translates into CPU QoS parameters for all stages and Network QoS for all network connections. The CPU QoS rate range is the same as that for the pipeline frame rate. The pipeline translator uses the same rate range and a fraction of the end-to-end pipeline latency to generate the Network QoS parameters.
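The decomposition performed by the pipeline translator can be sketched as follows. The text does not say how the latency fraction is chosen or how it is divided among connections; the `net_fraction` parameter and the equal split per connection are assumptions, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PipelineQoS:
    rate_range: Tuple[float, float]   # (min, max) frame rate in Hz
    latency: float                    # end-to-end latency in seconds
    stage_workloads: List[float]      # execution time per stage in seconds
    message_sizes: List[int]          # bytes per inter-stage connection


def translate(qos: PipelineQoS, net_fraction: float = 0.5):
    """Split a pipeline request into per-stage CPU and per-connection network requests."""
    # Every stage inherits the pipeline rate range and keeps its own workload.
    cpu = [{"rate_range": qos.rate_range, "workload": w}
           for w in qos.stage_workloads]

    # Assumed policy: share a fraction of the latency equally among connections.
    links = len(qos.message_sizes)
    delay = net_fraction * qos.latency / links if links else 0.0
    lo, hi = qos.rate_range
    net = [{"period_range": (1.0 / hi, 1.0 / lo),
            "delay": delay,
            "message_size": size}
           for size in qos.message_sizes]
    return cpu, net


request = PipelineQoS((0.2, 1.0), 12.0, [0.1, 0.2, 0.3], [1000, 2000])
cpu_requests, net_requests = translate(request)
```

The network requests carry period, delay and message size, matching the three QoS dimensions that NetEx uses according to Section 3.3.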


Figure 8. Parallel pipeline


Hierarchical  Feedback  Adaptation  for  Parallel  Data-Flow  Applications 

We have implemented an innovative and efficient hierarchical feedback adaptation mechanism for parallel pipeline applications [1]. It performs feedback adaptation at two levels in the SM hierarchy. The pipeline end-to-end latency is controlled at the HSM level while the CPU SMs perform CPU feedback adaptation independent of the HSM.

The pipeline QoS parameter we consider critical and want to control is the end-to-end latency. As the pipeline evolves in time, rates of intermediate stages may change as a result of CPU SM feedback adaptation. In normal circumstances, the input sensor period is maintained at a value greater than the current period of any stage/substage of the parallel pipeline application, but it can get lower because of independent CPU feedback adaptation. When accumulation of queues between stages increases the end-to-end latency beyond a maximum threshold, the PSM sets the input sensor period to the maximum value from the pipeline contract. A finite state machine in the PSM maintains this maximal period for a fixed time, allowing the queues to empty. Then the PSM sets the input sensor period back to the maximal current period of all stages, typically lower than the maximum period from the contract. We have proved in [1] that the end-to-end latency decreases, and that after a finite number of frames the pipeline enters a region of stability where the end-to-end latency and the output frame rate are within the contracted region.

This  method  is  simple  and  efficient,  as  the  only  parameter  to  be  adjusted  is  the  sensor  input  period,  while  the 
pipeline  stages  are  controlled  only  by  the  corresponding  CPU  SM.  This  mechanism  avoids  costly  communication 
and  coordination  between  the  HSM  and  all  the  CPU  SMs.  The  information  required  for  pipeline  feedback 
adaptation  is  minimal:  the  end-to-end  latency  for  the  current  frame  and  the  maximal  current  period  of  all  stages. 
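The pipeline-level controller described in this section can be sketched as a two-state machine driven by the two inputs named above. The class name, the method signature and the use of a caller-supplied timestamp are assumptions made for the example; in the normal state the sketch simply follows the maximal current stage period.

```python
class PipelineFeedback:
    """Two-state controller: normal operation, or draining the stage queues."""

    def __init__(self, max_period, latency_threshold, hold_time):
        self.max_period = max_period                # maximum period in the contract
        self.latency_threshold = latency_threshold  # latency that triggers draining
        self.hold_time = hold_time                  # how long to keep the maximum period
        self.draining_since = None                  # None means the normal state

    def on_frame(self, now, latency, max_stage_period):
        """Return the sensor input period to enforce after this frame report."""
        if self.draining_since is None:
            if latency > self.latency_threshold:
                # Queues are building up: slow the sensor to the contract maximum.
                self.draining_since = now
                return self.max_period
            return max_stage_period
        if now - self.draining_since >= self.hold_time:
            # Hold time elapsed: return to the maximal current stage period.
            self.draining_since = None
            return max_stage_period
        return self.max_period


fb = PipelineFeedback(max_period=5.0, latency_threshold=13.0, hold_time=10.0)
periods = [
    fb.on_frame(0, 2.0, 1.5),    # normal operation
    fb.on_frame(1, 30.0, 1.5),   # latency above threshold, start draining
    fb.on_frame(5, 20.0, 1.5),   # still inside the hold time
    fb.on_frame(11, 3.0, 1.6),   # hold time elapsed, back to normal
]
```

The controller touches only the sensor period, which reflects the point made in the text that no coordination with the individual CPU SMs is needed.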

In the next section we present experiments with synthetic pipeline applications and an Automatic Target Recognition application to estimate the performance of the RTARM system.


4.  Experiments  and  Performance  Evaluations 

To  evaluate  the  RTARM  system  we  designed  two  experiments.  The  first  deals  with  synthetic  pipeline 
applications  and  yields  performance  numbers  for  admission,  adaptation  and  QoS  expansion  for  the  CPU,  Network 
and  Pipeline  SMs.  The  second  experiment  tests  feedback  adaptation  for  parallel  pipeline  applications.  The  Forward 
Looking Infrared Automatic Target Recognition application provided an ideal testbed to prove the efficiency of our
hierarchical  feedback  adaptation  technique. 

The  runtime  environment  for  these  experiments  consists  of  three  450MHz  Dell  Workstation-400  machines, 
running  Windows  NT,  connected  via  a  Fore  ATM  switch  with  OC-3c  (155Mbps)  links.  Each  machine  hosts  a  CPU 
SM.  Both  the  network  SM  that  controls  the  inter-stage  communication  and  the  pipeline  SM  run  on  one  of  the  three 
machines.  We  consider  their  own  CPU  resource  consumption  negligible.  All  inter-SM  CORBA  communication 
uses  a  secondary  Fast  Ethernet  network,  so  the  ATM  lines  remain  100%  available.  We  used  the  NT  performance 
counter  for  precise  measurements. 

4.1  Performance  for  Admission  and  Adaptation 

For  evaluating  admission,  adaptation  and  expansion  for  pipeline  applications  we  devised  two  scenarios. 

Scenario  1. 

We tested admission of three-stage pipelines on a SM hierarchy with one HSM (P), one NSM (N) and two CPU SMs (C1, C2), as illustrated in Figure 9. The sequence of events is:

1. admit pipeline 1; no adaptation required.
2. admit pipeline 2 with higher criticality; stage 1.1 is adapted due to CPU constraints on SM C1; stages 1.2, 1.3 and network connections are adapted collaterally.
3. terminate pipeline 2; pipeline 1 is expanded back to its original QoS (all stages and the network connections).
4. try admission for pipeline 3 with lower criticality than 1; not enough CPU resources, admission is denied.
5. terminate pipeline 1.

Scenario 2.

Runs on the same environment as Scenario 1 and is similar, except the pipelines now have two stages and adaptation is caused only by network bandwidth constraints, not by CPU resource insufficiency.

Throughout the tests we measured the time to complete the RTARM interface calls for admission, adaptation and expansion for the CPU, Network and Pipeline SM. The measured time consists of the actual processing overhead and time to complete nested calls to: (1) application CORBA servers for the CPU SM; (2) the NetEx management subsystem and application CORBA servers for the Network SM (NetEx wrapper) and (3) LSMs for the Pipeline SM.

The  performance  measurements  for  the  Pipeline  SM  are  listed  in  Table  1,  for  the  CPU  SM  in  Table  2  and  for  the 
Network  SM  in  Table  3.  All  values  are  expressed  in  milliseconds. 


Figure 9. Scenario 1

Table 1: Measurements for PSM

Table 2: Measurements for CPU SM

For the PSM the "Total Time" columns include the sequence of recursive RTARM CORBA calls to the LSMs and the algorithm processing overhead. Some calls may require adaptation of lower criticality applications, such as test_reservation() at step 2 in scenario 1; other calls, like the expansion operations, always involve adaptation. From Table 1 we notice that the reservation operations and end_app() require extra processing work if adaptation is involved. Also the processing time for test_reservation() is considerably larger than all other calls since it involves back-and-forth QoS translation and reverse-translation. But what stands out is the large total time consumed for commit_reservation() for a three stage pipeline application, approximately 2.3 seconds. This time includes the duration for commit_reservation() calls to the CPU SM that take more than 500ms for each pipeline stage (see Table 2). A CPU commit_reservation() call actually generates a set_qos() call with the committed application QoS to the application stage CORBA server. The stages are not up and running when admission happens. The Orbix daemon [7] starts the stage process and passes the CORBA server IIOP TCP port number and IP address to the CPU SM. Only after the stage is up and initialized is it able to respond to the set_qos() CORBA call from the CPU SM. The time to start a Windows GUI application (the pipeline stage) on Windows NT 4.0 is around half a second for our test configuration.


                      w/o Adaptation                  with Adaptation
test_reservation      21473 - rm - C - T9J2S - 48414 - JWi  0  443T3
commit_reservation    45.434  0.637  44.797  0        49.962  1.105  48.857  0
test_adapt            0.056   0.056  0       0
test_expansion        33.093  0.355  0       32.738
commit_expansion      0.697   0.697  0       0
end_app               10.08   0.289  0       9.791

Table 3: Measurements for Network SM


Table 3 shows time measurements for the Network SM. These are more complex since the NetEx wrapper communicates through TCP/IP with the NetEx Host Traffic Manager [2,11] and with stages through set_qos() CORBA calls (only during commit_reservation()). The communication latency overhead caused by NetEx is comparable to CORBA communication overhead, between 10 and 45ms.

We conclude that operation of the RTARM system is efficient, except for the commit_reservation() call for CPU applications. This major delay can be completely avoided by pre-loading the applications before the client submits the pipeline contract to the HSM. The overall system performance may further improve by using a faster CORBA implementation that guarantees real-time operation deadlines.

4.2  Performance  for  Hierarchical  Feedback  Adaptation 
The  Automatic  Target  Recognition  Experiment 

We  tested  the  RTARM  feedback  adaptation  mechanism  on  a  true  mission-critical  application.  The  ATR 
application,  schematically  shown  in  Figure  10,  processes  video  frames  captured  by  a  camera,  and  displays 
recognized  targets  on  a  display.  Stage  0  (the  sensor)  generates  frames  that  are  passed  through  a  series  of  filters  and 
processing  elements  up  to  stage  6,  which  displays  the 
original image and the identified targets. The frames are 8-bit, 360x360 pixels, monochrome images, and contain a
variable  number  of  targets  (from  3  to  50),  depending  on  the 
frame.  Stages  4,  5  and  6  expose  a  variable  workload, 
proportional  to  the  number  of  targets,  that  without  feedback 
adaptation  would  cause  queue  accumulations  with  negative 
effect  on  the  end-to-end  frame  latency. 


[Figure 10: ATR pipeline stages, annotated with End-to-End Latency and Frame Arrival Period]

Figure 10. ATR pipeline application and QoS

Performance Metrics and Evaluation

The ATR pipeline contract requires an acceptable output frame period interval of [1,5] s, and a frame latency of 0.7-13 s. The seven ATR stages ran at a variable workload between 0.02 and 1.5s and within the same period interval [1,5] s.

We first present timing measurements for the feedback adaptation at the CPU SM and PSM level (Figure 11). We measured the processing overhead of the feedback adaptation code (part 2 in Figure 11) and the time it takes the SM to react from the moment it receives the current QoS from the application until its adaptation command is enforced (part 2 + part 3).


Figure 11. Feedback adaptation performance measurements.

The measured times are displayed in Table 4. For the CPU feedback adaptation, detecting and enforcing the QoS adaptation takes around 4.4ms. Most of the time, 3.9ms, is spent in a set_qos() operation, a two-way normal, local CORBA call. The pipeline adaptation enforcement includes a set_qos() call to the CPU SM that controls the sensor (or first stage), which in turn calls the first stage directly with a set_qos() call. This explains why enacting pipeline QoS adaptation takes almost twice as long as for CPU SM QoS.


              Detection and decision processing (2)   Decision Enactment (3)   Total Time (2+3)
CPU SM        0.508 ms                                3.914 ms                 4.422 ms
Pipeline SM   0.859 ms                                6.816 ms                 7.675 ms
Table 4. Feedback adaptation performance results for CPU SM and PSM

Figure 12 displays CPU feedback adaptation for stage 4 in the ATR pipeline. The stage has variable workload that triggers its CPU SM to change its rate. Points A indicate overload that triggers rate decrease and points B indicate chronic underutilization that determines rate increase.

[Figure 12: plot titled "CPU Feedback Adaptation"; legend entries include Workload and Rate]

Figure 12. CPU SM feedback adaptation for a task with variable workload


[Figure 13: legend entries Sensor Input Period, End-to-end Latency, Threshold]

Figure 13. Latency Variation for ATR with and without pipeline feedback adaptation


While running the ATR application, the pipeline feedback adaptation mechanism makes sure the end-to-end latency and rate stay in the contracted range (Figure 13). To demonstrate its effectiveness in practice, we disabled the pipeline feedback adaptation after some time while keeping the sensor input period at a sustained low value of 1.48s (0.67Hz). This caused accumulation of frames in stage queues that translated into an increasing end-to-end frame latency. While feedback adaptation was disabled we did not get latency measurements, so we drew a dotted line between points A and B. When the latency reached 30s, well above the contracted value, we re-enabled pipeline feedback adaptation. Immediately the PSM increased the sensor input period up to 5s. The latency went rapidly down (B -> C), below the threshold, after a brief spike caused by the inertia of the more than 23 frames already in transit through the pipeline.

Our  hierarchical  feedback  adaptation  algorithm  proved  to  be  effective  and  efficient.  Detection,  decision  and 
enforcement  take  less  than  8ms  and  involve  only  the  CPU  SMs  for  the  sensor  stage  and  the  last  stage  that  actually 
reports  the  latency  and  rate. 


5.  Conclusions 

This  paper  presents  the  middleware  architecture  and  implementation  of  the  RTARM  system.  We  have  focused 
on  the  architectural  elements  that  enable  RTARM  support  for  integrated  services: 

•  the  uniform  service  management  recursive  hierarchy  and  protocols 

•  the common architecture of a service manager that facilitates rapid OO prototyping, massive code reuse and features plug-and-play support for SM components.

Then  we  detailed  the  specific  service  managers  (CPU,  Network  and  Pipeline  SM)  that  constitute  the  RTARM 
hierarchy.  Finally,  we  presented  experiments  that  illustrate  the  practical  use  of  the  RTARM  system  and  its 
effectiveness  for  a  real-world  Automatic  Target  Recognition  application.  We  demonstrated  that  our  hierarchical 
feedback  adaptation  mechanism  is  able  to  efficiently  control  in  real  time  the  dynamic  behavior  of  parallel  pipeline 
distributed  applications. 

The  clean  and  flexible  architecture  of  a  SM  allowed  us  to  integrate  quickly  a  new  service  provider  in  the 
RTARM  hierarchy.  We  built  an  object  wrapper  around  the  incompatible  interface  of  the  NetEx  network 
management  system  that  provided  the  same  CORBA  interface  implemented  by  all  RTARM  service  managers. 

We plan to port RTARM to a real-time CORBA implementation, such as WUStL TAO [12] and to optimize its performance. We also intend to develop more sophisticated hierarchical feedback adaptation mechanisms with prediction features which would further decrease the system reaction time.

Abstract

This paper presents the Real Time Adaptive Resource Management (RTARM) system, a middleware architecture for real-time adaptive resource management with support for integrated services, developed by the Honeywell Technology Center. This system is designed for distributed computing environments where mission-critical applications must be able to adapt to mission-dependent variations in resource demands as well as dynamic changes in resource availability. We describe the distributed hierarchical object-oriented architecture of RTARM, its flexibility and we focus on the issue of feedback adaptation in the RTARM system. Feedback adaptation is responsible for maintaining the application QoS parameters within the acceptable region and provides corrective actions triggered by significant events. The main contribution of this paper is a hierarchical feedback adaptation method that efficiently controls the dynamic QoS behavior of distributed data-flow applications, such as sensor-based data streams or mission-critical command and control applications. The method works independently at two levels in the RTARM hierarchy, at the distributed application level and at the CPU resource level. Our approach is simple and efficient. There is only one parameter that controls the application QoS at the distributed application level. Independently, the CPU service management level performs feedback adaptation to keep processor utilization within acceptable ranges. We present the analytical model for feedback adaptation applied to periodic distributed data-flow applications. We also describe experimental results for an Automatic Target Recognition distributed application and the impact of hierarchical feedback adaptation on the application behavior and its QoS parameters.

Key words: hierarchical feedback adaptation, distributed resource management, real-time applications, QoS negotiation and adaptation.

1. Introduction


Large,  time-critical,  distributed  systems,  such  as  defense  computing  environments,  usually  host  a  mix  of 
application types that share common communication and processing resources. These applications exhibit a high degree of variability in performance requirements and criticality, and demand fault tolerance and reliability. Building such a system using Commercial Off-The-Shelf (COTS) components is a challenge. In order to keep complexity under
control,  there  is  a  definite  need  for  an  application  Quality  of  Service-aware  resource  management  system. 

In  recent  years,  there  have  been  several  efforts  to  build  adaptive  resource  management  systems  for 
heterogeneous  resources  with  real-time  constraints  [2,3,4,8,9,11].  This  paper  presents  developments  of  the  Real 
Time  Adaptive  Resource  Management  (RTARM)  system  [2],  designed  by  the  Honeywell  Technology  Center.  The 
goal  of  the  RTARM  project  is  to  develop  a  hierarchical  real-time  adaptive  resource  management  system, 
implemented  as  middleware  on  COTS  components  and  to  apply  it  to  mission-critical  distributed  applications. 

The  RTARM  system  defines  a  hierarchical  resource  management  architecture  that  provides  the  following  basic 
services [2]: (1) scalable end-to-end criticality-based Quality of Service (QoS) contract negotiation that allows distributed applications to share common resources while maximizing their utilization and execution quality; (2) end-to-end QoS adaptation that dynamically adjusts application resource utilization according to their availability while optimizing application QoS; (3) integrated services for CPU and network resources with end-to-end QoS guarantees and (4) real-time application QoS monitoring for integrated services. An innovative feature of
RTARM  is  the  hierarchical  resource  management  architecture  that  unifies  heterogeneous  resources  and  their 
management  functions  into  a  uniform  abstract  resource  model.  In  this  paper,  we  refer  to  services  and  resources 
interchangeably.  The  central  piece  of  the  architecture  is  the  Service  Manager,  a  recursive  structural  component. 
This  encapsulates  a  set  of  services  and  their  management  functions.  Because  all  service  managers  export  the  same 
common  interface,  it  becomes  easy  to  build  layered  hierarchies  recursively,  in  which  heterogeneous  services  are 
integrated  bottom-up.  This  also  helps  rapid  object-oriented  prototyping  and  development. 

The RTARM approach facilitates: (1) provision of integrated services and end-to-end adaptive QoS management, (2) easy extensibility to offer new service types, (3) design flexibility that provides affordable plug-and-play for architecture components as well as third-party service providers.

Many  mission-critical  distributed  command  and  control  applications,  such  as  Automatic  Target  Recognition 
(ATR)  [5],  exhibit  a  degree  of  flexibility:  they  tolerate  a  range  of  QoS  and  resource  usage  above  a  minimum  limit. 
Their  performance  depends  on  the  allocated  resources  and  they  are  ready  to  trade  off  some  application  service 
quality to save the critical services. For these applications, it is important to have a mechanism that regulates their
dynamic behavior and protects them from contract violations. The main contribution of this paper is a new
hierarchical QoS-based real-time feedback adaptation method for distributed periodic data-flow applications with
parallel-pipeline  structure.  We  have  developed  an  analytical  model  that  enables  control  of  the  end-to-end  QoS 
behavior  for  the  entire  distributed  application  by  adjusting  the  input  rate  in  the  pipeline.  This  model  can  be 
generally applied to any type of application with data-flow pipeline structure and a compatible QoS representation,
such  as  multimedia  streams  and  distributed  command  and  control  applications.  We  applied  this  model  of  feedback 
adaptation  to  our  RTARM  integrated  service  provider  and  experimented  with  a  distributed  ATR  application. 

Related  work 

Other adaptive real-time resource management systems are GRMS [3,4], ARA [9,10] and QualMan [8]. GRMS
has a hierarchical structure that reflects the application data flow but does not offer feedback adaptation. The ARA
framework [10] provides feedback adaptation for applications having a discrete set of acceptable configurations
with specific resource needs. ARA accomplishes feedback adaptation by resource reallocation. The work in [7] proposes a
feedback adaptation method that adjusts the rate of data sent from a server to clients based on observation and
prediction using a control-theoretical model. The system described in [6] uses digital control theory to determine
the states of the adaptive system, which may activate control algorithms for adaptation. QualMan [8] is an adaptive
resource management system designed for distributed multimedia applications.

Our work differs from these approaches at the resource management architecture level, by supporting other
application paradigms or by the way it accomplishes feedback adaptation.

The rest of this paper is organized as follows. In Section 2 we briefly describe the object-oriented hierarchical
architecture of the RTARM system, its interfaces and several service managers. Section 3 presents the feedback
adaptation model and analysis for the periodic parallel-pipeline applications. Section 4 continues with the
description of the hierarchical feedback adaptation in RTARM, the ATR experiment, performance metrics and
evaluation and the impact of feedback adaptation on the ATR QoS. Section 5 concludes the paper and presents
directions for future work.


2.  The  Real  Time  Adaptive  Resource  Management  Architecture 

We  have  implemented  an  RTARM  prototype  that  supports  periodic  independent  tasks  and  periodic  parallel 
pipeline  applications  with  real-time  requirements.  The  RTARM  system  is  built  as  a  middleware  layer  above  the 
operating  system  and  network  resources.  RTARM  allows  service  initiation  requests  (admission  requests)  from 
clients and views applications as service consumers. When a client requires a service from a service manager on
behalf of an application, it negotiates a QoS contract that defines the allocated services. This contract may change
later  when  the  application  is  adapted. 

The  middleware  approach  provides  the  benefit  of  flexibility  and  portability  but  the  increased  distance  to  the  real 
resources  makes  fine-grained  control  difficult.  RTARM  supports  a  multidimensional  representation  of  QoS,  defined 
by  a  set  of  parameters  (e.g.  rate,  latency,  jitter)  specified  as  a  range  [QoSmin,  QoSmax].  The  RTARM  system  strives 
to  allocate  the  best  available  services  to  applications  with  priority  for  ones  that  are  more  critical. 
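
As a purely illustrative sketch, a QoS parameter negotiated as a range and the "more critical first" ordering could be modeled as below. The names `QoSRange`, `Request` and `service_order`, and the convention that a larger `criticality` value means a more critical application, are assumptions made for this example and are not part of the RTARM interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSRange:
    """One QoS dimension (e.g. rate) negotiated as a closed range [qos_min, qos_max]."""
    qos_min: float
    qos_max: float

    def contains(self, value: float) -> bool:
        return self.qos_min <= value <= self.qos_max

    def clamp(self, value: float) -> float:
        """Nearest operating point that still lies inside the contracted range."""
        return max(self.qos_min, min(self.qos_max, value))


@dataclass
class Request:
    app: str
    criticality: int   # assumption: larger value means more critical
    qos: dict          # parameter name -> QoSRange


def service_order(requests):
    """Order requests so that more critical applications are served first."""
    return sorted(requests, key=lambda r: -r.criticality)
```

A multidimensional QoS is then simply a dictionary such as `{"rate": QoSRange(5, 20), "latency": QoSRange(0.5, 2.0)}`.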

2.1  Hierarchical  Adaptive  Service  Management  for  Integrated  Services 

The  basic  block  of  the  RTARM  recursive  service  manager  hierarchy  is  the  Service  Manager  (SM).  It 
encapsulates  a  set  of  services  and  their  management  mechanism.  At  the  bottom  of  the  hierarchy  are  SMs  that 
provide  management  functions  for  basic  resources,  such  as  CPU  or  network  resources,  and  directly  control  resource 
utilization  by  application  components.  Higher  level  services  may  be  built  on  top  of  lower-level  services,  giving  rise 
to a service hierarchy. One use of a service hierarchy is to provide abstract or integrated resources for clients.

Figure 1 depicts a simple runtime configuration with two different Lower-level SMs (LSM), a CPU and a Network SM,
at the bottom of the hierarchy, two applications and two clients accessing services from two Higher-level SMs (HSM).

Resources as well as negotiation requests are treated uniformly across the entire hierarchy. HSMs may act as clients
for lower-level SMs that provide services to HSMs. The hierarchy allows dynamic configuration as new service
managers can be added to the system anytime. Clients can directly access service providers at any point in the
hierarchy, depending on their requirements. A request for an integrated service sent to an HSM may require
resources from lower-level service providers. During the application admission procedure, a virtual spanning tree is
built over the SM hierarchy that remains valid for the entire application lifetime.

Figure 1. Sample RTARM hierarchy
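
The recursive structure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the RTARM implementation: the class `ServiceManager`, its methods, and the convention that a lower-level SM offers exactly one service named after itself are all hypothetical.

```python
class ServiceManager:
    """Hypothetical uniform SM node: an LSM has no children, an HSM has some."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def provides(self, service):
        """True if this SM, or any SM below it, offers the named service."""
        if not self.children:
            # Assumption: a lower-level SM offers one service, named after itself.
            return service == self.name
        return any(child.provides(service) for child in self.children)

    def spanning_tree(self, services):
        """Return (name, subtrees) restricted to the SMs needed for `services`."""
        needed = [child.spanning_tree(services)
                  for child in self.children
                  if any(child.provides(s) for s in services)]
        return (self.name, needed)
```

For example, with `cpu = ServiceManager("cpu")`, `net = ServiceManager("network")` and `psm = ServiceManager("pipeline", [cpu, net])`, a request that needs only CPU service yields a tree containing the pipeline SM and the CPU SM, while the network SM is left out.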

2.2  Adaptation 

RTARM recognizes three situations when application QoS may be changed after admission [2]: (1a) QoS
shrinking/reduction of lower criticality applications when a new application arrives; (1b) QoS
expansion/improvement when applications depart and release resources; and (2) feedback adaptation. While (1a)
and (1b) imply contract changes and involve other applications, feedback adaptation does not change the contract
but only varies the current operational point of the application within the contracted QoS region. Feedback
adaptation is triggered only by significant changes in application behavior, such as resource overload that results in
a lowering of the QoS operating point, resource underutilization that prompts RTARM to increase the application QoS
operating point within the contracted QoS region, and QoS contract violations that require corrective actions.
Sections 3 and 4 present feedback adaptation in detail for pipeline applications.

2.3 The Service Manager Object Architecture and Interface

The unified resource model approach for RTARM brings the benefits of a uniform architecture for all service
managers and a common interface between them, which we implemented using CORBA. Figure 2 shows a simplified
conceptual model of a service manager.


[Figure 2 diagram: Negotiator, Translator, Allocator, Scheduler, FbAdaptor and Detector components, plus data structures
(Service Managers, Application Contracts, Monitoring and FA Data). The components are grouped into those involved in
feedback adaptation and those involved in admission control and cross-application adaptation.]

Figure 2. Service Manager simplified object model

A service manager is implemented as a user-level process with components implemented as communicating
objects. The components in an SM are as follows:

• Negotiator: brokers contract admission, delegates responsibilities to other components and exports the external
RTARM CORBA interface.

• Translator: translates higher-layer integral QoS into lower-layer QoS representation.

• Allocator: handles resource allocation/release when no adaptation is necessary.

• Adapter: handles resource allocation/release with adaptation and QoS expansion/contraction.

• Scheduler: determines whether allocation of resources and expansion of application QoS is feasible.

• Enactor: enforces changes in application QoS or status.

• Monitor: keeps an eye on applications in execution and passes status information and QoS usage to the
Detector. Exports the external RTARM CORBA interface.

• Detector: uses application runtime information (e.g. current QoS operational point) to detect significant
changes in application operation (e.g. overload, underutilization, contract violation). Triggers Feedback
Adapter actions.

• Feedback Adapter: decides corrective actions for applications when their runtime status changes significantly.

Additional data structures exist to hold information regarding application contracts, other service managers and
available services.

The clear separation of functionality facilitates object reuse and flexibility. RTARM provides a common
object-oriented execution framework that allows users to dynamically load SM components (Scheduler, Adapter, ...) from
shared libraries during runtime configuration.

Each SM implements and exports three interfaces: (1) Negotiator: admission control, collateral adaptation, QoS
expansion and application control, such as suspend, resume and end; (2) Monitor: application monitoring and event
propagation; (3) ServiceManager: service manager hierarchy setup, register/deregister SM.

RTARM  uses  a  modified  form  of  the  Ripple  Scheduling  admission  protocol  from  GRMS  [3,4].  Admission, 
expansion  and  adaptation  are  transaction-based  two-phase-commit  protocols.  User  clients  are  shielded  from  the 
inter-SM admission protocol implementation details. A simple call admit_app() embeds the two phases and gives
clients  simple  semantics  for  application  admission. 
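
The idea of hiding both phases behind one call can be sketched as follows. This is a generic two-phase commit over flat resource reservations, not the modified Ripple Scheduling protocol itself; only the name `admit_app` comes from the text, and its signature, the `LowerLevelSM` class and the `prepare`/`commit`/`abort` methods are assumptions made for illustration.

```python
class LowerLevelSM:
    """Hypothetical resource manager that supports tentative reservations."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.committed = 0.0
        self.pending = {}   # transaction id -> tentatively reserved amount

    def prepare(self, txn, amount):
        """Phase 1: reserve tentatively and vote yes only if capacity suffices."""
        held = self.committed + sum(self.pending.values())
        if held + amount > self.capacity:
            return False
        self.pending[txn] = amount
        return True

    def commit(self, txn):
        """Phase 2 (success): make the tentative reservation permanent."""
        self.committed += self.pending.pop(txn, 0.0)

    def abort(self, txn):
        """Phase 2 (failure): drop the tentative reservation."""
        self.pending.pop(txn, None)


def admit_app(txn, demands):
    """Admit an application atomically; `demands` is a list of (sm, amount)."""
    prepared = []
    for sm, amount in demands:
        if sm.prepare(txn, amount):
            prepared.append(sm)
        else:
            for p in prepared:      # one "no" vote rolls everything back
                p.abort(txn)
            return False
    for sm in prepared:
        sm.commit(txn)
    return True
```

The client sees a single boolean result, while every service manager either commits the whole request or keeps none of it.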

2.4  Service  Manager  Instances 

We have currently implemented three service managers for the RTARM project: CPU, Network and a higher-level
Pipeline SM. All follow the general SM internal architecture described in Section 2.3.

CPU  Service  Manager 

The  CPU  SM  provides  periodic  applications  access  to  a  processor  resource.  Each  computing  node  has  a  CPU 
SM,  allowing  concurrent  applications  to  share  a  CPU.  The  application  QoS  is  bi-dimensional;  the  two  parameters 
are  application  execution  rate  (R)  and  iteration  execution  time  (W).  The  specific  CPU  scheduling  policy  is  isolated 
within  the  Scheduler  object  and  the  Monitor  keeps  track  of  application  CPU  utilization.  CPU  feedback  adaptation  is 
presented  in  more  detail  in  section  4. 

Network  Service  Manager 

We integrated the NetEx real-time network service manager [1,11] from Texas A&M University into the
RTARM  system.  NetEx  runs  as  a  middleware  and  provides  connection-oriented  real-time  communication  with 
guaranteed  delay  and  bandwidth  over  COTS  network  infrastructure,  such  as  ATM  and  switched  10/100  Mbps 
Ethernet.  NetEx  uses  a  tri-dimensional  QoS:  period,  delay  and  message  size  and  adds  the  connection  source  and 
destination  network  addresses  to  the  connection  contract.  The  NetEx  resource  management  interface  is,  however, 
incompatible  with  the  RTARM  interfaces.  It  has  different  semantics  and  it  does  not  export  the  two-phase  commit 
protocol.  We  built  an  object-oriented  wrapper  around  NetEx  that  hides  the  incompatibilities  and  exports  the 
RTARM  interface  to  clients,  applications  and  HSMs. 

Pipeline  Service  Manager 

The Pipeline Service Manager (PSM) is a higher-level SM that aggregates services from lower-level SMs (CPU,
Network, other HSMs) into a higher-level integrated representation suited for pipeline applications. A PSM client
can be a user or another HSM. The QoS Translator plays an essential role inside a PSM. It translates a request for
integrated services into individual requests dispatched to LSMs.

Our PSM supports periodic independent tasks and periodic parallel pipeline applications, consisting of
communicating stages in an arbitrary configuration, with a single source and a single sink node.

For periodic pipeline applications, we use a QoS consisting of end-to-end message latency and rate for the final
stage. The admission contract also contains execution time for each stage as well as the message size for each
connection. It is the job of the pipeline translator to decompose the integrated-service pipeline request into CPU
and network admission requests.
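
A minimal sketch of such a decomposition is shown below. It assumes the CPU QoS is (rate, execution time) and the network QoS includes period and message size, as described for the CPU SM and NetEx; the function name `translate_pipeline`, the dictionary layout, and the rule that stages on the same node need no network request are assumptions made for this example. How the end-to-end latency is split into per-connection delays is not specified in the text, so the delay parameter is left out.

```python
def translate_pipeline(stages, connections, rate):
    """Split one integrated pipeline request into per-resource requests.

    stages:      {stage_name: (node, exec_time)}
    connections: [(src_stage, dst_stage, msg_size)]
    rate:        contracted pipeline rate in frames per second
    """
    period = 1.0 / rate
    cpu_requests = [
        {"node": node, "task": name, "rate": rate, "exec_time": exec_time}
        for name, (node, exec_time) in stages.items()
    ]
    net_requests = [
        {"src": stages[src][0], "dst": stages[dst][0],
         "period": period, "msg_size": size}
        for src, dst, size in connections
        # Assumption: stages placed on the same node do not use the network.
        if stages[src][0] != stages[dst][0]
    ]
    return cpu_requests, net_requests
```

Each returned dictionary would then be submitted to the CPU SM of the named node or to the network SM.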

The Ripple Scheduling admission algorithm [3,4] fits well with our hierarchical recursive structure. A top-level
admission request generates sequential recursive execution of the two-phase admission protocol at all intermediate
layers in the resource allocation spanning tree. The PSM admits applications at the available QoS. The QoS
expansion mechanism will later provide more resources to higher-criticality applications and boost both their QoS
operating point and their contracted QoS.

Figure 3. Parallel pipeline application

The  PSM  also  provides  hierarchical  feedback  adaptation  (presented  in  section  4)  that  continuously  monitors 
application  QoS  parameters  and  controls  their  resource  utilization,  taking  corrective  actions  if  necessary. 


2.5 RTARM Flexibility and Plug-and-Play

All  service  managers  have  a  similar  internal  architecture  and  each  SM  component  has  the  same  programming 
interface,  regardless  of  the  SM  type  and  the  resources  it  manages.  This  uniformity  permits  a  common  execution 
framework  for  all  SMs.  During  SM  initialization  or  at  runtime,  a  configuration  manager  loads  components  from 
shared  libraries.  These  can  easily  be  replaced  without  recompiling  the  whole  SM.  For  instance,  we  can  get  a  Rate 
Monotonic  Analysis-based  CPU  SM  just  by  replacing  the  scheduler  component.  Our  RTARM  implementation  has  a 
single  executable  program  and  different  SMs  are  instantiated  just  by  loading  RTARM  SM  components  from 
different  shared  libraries.  This  increased  flexibility  allows  quick  prototyping  and  provides  a  plug-and-play  feature 
for  SM  components  developed  by  third  parties. 
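
To make the scheduler swap concrete, the sketch below keeps the admission policy entirely inside a replaceable scheduler object. It is only an illustration: the class names and the `feasible` method are hypothetical, components are passed in directly rather than loaded from shared libraries, and the rate-monotonic check is the classic Liu and Layland utilization bound, which is sufficient but not necessary.

```python
class UtilizationScheduler:
    """Accepts a task set while total CPU utilization stays at or below 1."""

    def feasible(self, tasks):
        return sum(c / p for c, p in tasks) <= 1.0


class RateMonotonicScheduler:
    """Liu-Layland bound: n tasks are schedulable if U <= n * (2**(1/n) - 1)."""

    def feasible(self, tasks):
        n = len(tasks)
        if n == 0:
            return True
        return sum(c / p for c, p in tasks) <= n * (2 ** (1.0 / n) - 1)


class CpuServiceManager:
    """Hypothetical CPU SM whose scheduling policy is a pluggable component."""

    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.tasks = []   # admitted (exec_time, period) pairs

    def admit(self, exec_time, period):
        candidate = self.tasks + [(exec_time, period)]
        if self.scheduler.feasible(candidate):
            self.tasks = candidate
            return True
        return False
```

Constructing `CpuServiceManager(RateMonotonicScheduler())` instead of `CpuServiceManager(UtilizationScheduler())` changes the admission policy without touching the rest of the service manager.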


2.6  Discussion 

Work  on  the  RTARM  project  is  still  in  progress.  The  two-phase  commit  admission  and  adaptation  protocols 
provide consistency and avoid the need for sophisticated synchronization between service managers. However, they
also pose some scalability problems with deeper SM hierarchies. A deep SM hierarchy could slow the reaction
speed  for  feedback  adaptation,  as  application  monitoring  information  has  to  bubble  through  the  hierarchy  up  to  the 
HSM  that  got  the  admission  request  from  the  client. 

The  middleware  approach  itself  brings  extra  performance  penalties.  Direct  control  over  resources  is  difficult, 
and  RTARM  must  rely  on  OS  services  or  other  middleware  intermediate  service  managers.  The  increased 
flexibility, portability and the chance for rapid prototyping make the middleware implementation a reasonable
compromise. The flexible SM architecture makes the implementation of the NetEx wrapper for the network SM
easy.

Our  current  RTARM  implementation  runs  on  Windows  NT  machines  and  uses  CORBA  for  inter-process 
communication. While Windows NT proved a stable development environment, we had problems with its
coarse-grained process scheduler and timer functionality. On the other hand, we found that CORBA fits well with the
RTARM architecture. We plan to port RTARM in the near future to a real-time ORB and to optimize its
performance. 


3.  Feedback  Adaptation  Model  and  Analysis  for  Pipeline  Applications 

This  section  presents  a  model  for  periodic  pipeline  applications  and  introduces  an  efficient  and  stable  method  for 
feedback  adaptation.  We  consider  the  end-to-end  latency  as  the  most  critical  QoS  parameter.  The  main  result  is  that 
adjusting  only  the  period  for  the  input  sensor  can  control  the  end-to-end  latency  of  a  pipeline  application. 

A pipeline application consists of stage tasks that process data sequentially. We assume a sensor periodically
enters data frames into the pipeline. Each frame is processed by each stage in turn and then sent to the next
stage.  A  clock-based  pipeline  assumes  that  each  stage  operation  is  synchronous  and  periodic.  If  a  frame  is  available 
for  processing  at  the  beginning  of  a  period,  the  stage  will  process  and  send  it  to  the  next  stage(s)  in  the  data  flow.  If 
no  frame  is  available  at  the  beginning  of  a  period,  the  stage  will  block  until  the  beginning  of  the  next  period,  when 
it  will  repeat  the  same  cycle. 

Our model ignores the network communication overhead between two stages. This assumption does not affect
the feedback adaptation for the Automatic Target Recognition experiment because of the large disparity between
the stage period (1-5 s) and communication latency (0.05 s).

Our analysis assumes that the execution time and period of each stage are constant. These parameters may vary
as the pipeline application evolves in time, so our analysis applies to a particular instant of time. It says that if,
starting from that moment, the sensor input period is raised above some value, then, with the currently set
parameters, the pipeline latency exhibits deterministic behavior. In this way, the analysis may be applied at any time
instant for the corresponding parameters.

Section  3.1  presents  our  main  results  and  an  example  for  the  clock-based  simple  pipeline  and  section  3.2 
generalizes  for  clock-based  pipeline  with  composite  stages. 

We  have  also  analyzed  the  event-driven  pipeline  model,  which  may  be  useful  for  other  types  of  applications. 
This  model  assumes  the  stages  are  aperiodic.  They  may  start  execution  of  a  frame  whenever  it  becomes  available. 
The  results  obtained  for  this  model  are  similar  to  those  of  the  clock-based  model:  the  sensor  input  period  is  the  only 
factor  the  pipeline  application  needs  to  adjust  to  control  the  pipeline  end-to-end  latency.  Due  to  the  space  limitation, 
we  do  not  describe  this  model  here. 




3.1.  Clock-Based  Simple  Pipeline 

This  section  starts  with  the  description  of  the  clock-based  pipeline  application  model,  then  presents  the 
theoretical  results  for  the  control  of  the  end-to-end  pipeline  latency  and  finalizes  with  an  illustrative  example. 

The simple pipeline consists of individual applications (stages). Each stage receives a frame, processes it and
then sends it to the next stage in the data flow.

Consider a pipeline with N+1 stages:

sensor (period T) --> stage 0 --> stage 1 --> ... --> stage N

Figure 4. Linear, simple pipeline


Notations:

$N+1$ is the total number of stages.

$T$ is the period at which the sensor pushes frames into the pipeline. It may change over time, but we assume it stays
constant starting with the frame with which we develop the analysis.

$C(i)$ is the execution (processing) time on stage $i$.

$T(i)$ is the period of stage $i$, $T(i) > C(i)$.

$W(i, n)$ is the waiting time for frame $n$ at stage $i$. It represents the time the frame needs to wait before being
processed by stage $i$. It is greater than 0 if stage $i$ did not finish processing the previous frame.

$W(i, n) \ge 0$

$W(i, n) = \max[\, t_{out}(i, n-1) - t_{out}(i-1, n),\ 0 \,]$, where $t_{out}(i, n)$ is the time at which stage $i$ produces
output for frame $n$.

$S(i, n)$ is the synchronization time. It is the time frame $n$ waits to synchronize with the beginning of the next
period, for stage $i$. $0 < S(i, n) \le T(i)$

$l(i, n)$ is the latency for frame $n$ at stage $i$.

$l(i, n) = C(i) + W(i, n) + S(i, n)$

$e(i, n)$ is the end-to-end latency up to and including stage $i$, for frame $n$.

$e(i, n) = \sum_{j=0}^{i} l(j, n)$

$L(n)$ is the end-to-end latency for the whole pipeline, for frame $n$.

$L(n) = e(N, n) = \sum_{j=0}^{N} l(j, n) = \sum_{j=0}^{N} C(j) + \sum_{j=0}^{N} W(j, n) + \sum_{j=0}^{N} S(j, n)$


Definition 1:

The pipeline is in the state $S_k$, where $0 \le k \le N$, for a frame $x$, if for all $i = 0..k$ relation (1) is true:

$e(i, x) \le \sum_{j=0}^{i} ( C(j) + T(j) )$    (1)

Observation: If a pipeline is in the state $S_k$, then it is also in states $S_{k-1}, S_{k-2}, S_{k-3}, ..., S_0$.

Definition 2:

We define the stable region for the end-to-end latency as the interval
$[\, \sum_{j=0}^{N} C(j),\ \sum_{j=0}^{N} ( C(j) + T(j) ) \,]$.

We say the pipeline is in the stable region if its end-to-end latency is within that interval.

If a pipeline is in the state $S_N$ for frame $x$ then it is in the stable region, because:

$\sum_{j=0}^{N} C(j) \le L(x) \le \sum_{j=0}^{N} ( C(j) + T(j) )$

The left limit for $L(x)$ is evident, because
$L(x) = \sum_{j=0}^{N} C(j) + \sum_{j=0}^{N} W(j, x) + \sum_{j=0}^{N} S(j, x)$ and $\sum_{j=0}^{N} W(j, x) \ge 0$,
$\sum_{j=0}^{N} S(j, x) \ge 0$.

From  the  application  point  of  view  it  is  important  the  pipeline  latency  be  limited  by  an  upper  bound,  because 
this  guarantees  it  does  not  increase  infinitely  over  time.  The  stable  region  of  a  pipeline  corresponds  to  optimal 
pipeline  behaviour,  in  the  sense  that  its  end-to-end  frame  latency  is  bounded.  Next  we  present  two  theorems:  the 
first  one  refers  to  the  case  when  the  pipeline  is  in  the  stable  region  and  shows  which  sensor  input  periods  maintain 
the  pipeline  there  for  the  next  frames.  The  second  theorem  handles  the  case  when  the  pipeline  is  not  in  the  stable 
region,  and  gives  a  solution  which  assures  that  the  pipeline  converges  to  the  stable  region  after  a  finite  number  of 
frames. 

Lemma 1 proves a useful relation, used in the proofs of the next two theorems.

Lemma 1:

If $W(i, n) > 0$ then the following relation is true:

$e(i, n) = e(i, n-1) + T(i) - T$    (2)

Proof:

$W(i, n) > 0 \Rightarrow W(i, n) = t_{out}(i, n-1) - t_{out}(i-1, n)$

$t_{out}(i, n-1) > t_{out}(i-1, n) \Rightarrow S(i, n) = T(i) - C(i)$

$l(i, n) = C(i) + W(i, n) + S(i, n) = C(i) + W(i, n) + T(i) - C(i) = W(i, n) + T(i)$

$W(i, n) = t_{out}(i, n-1) - t_{sensor}(n-1) - t_{out}(i-1, n) + t_{sensor}(n-1)$

where $t_{sensor}(x)$ is the time instant when the sensor pushes frame $x$, so that $t_{sensor}(n) = t_{sensor}(n-1) + T$.

$W(i, n) = e(i, n-1) - ( t_{out}(i-1, n) - t_{sensor}(n-1) - T ) - T = e(i, n-1) - e(i-1, n) - T$

$W(i, n) = e(i, n-1) - ( e(i-1, n) + l(i, n) ) + l(i, n) - T \Rightarrow W(i, n) = e(i, n-1) - e(i, n) + l(i, n) - T$

Substituting $l(i, n) = W(i, n) + T(i)$ implies $e(i, n) = e(i, n-1) + T(i) - T$.

□


Theorem 1 refers to the case when the pipeline is in the stable region. It proves that it is enough to maintain the
sensor input period greater than the period of each stage in order to keep the pipeline in the stable region.

Theorem 1

If the pipeline is in the stable region for frame $n-1$ and the sensor input period $T > \max_{i=0..N} T(i)$, then the
pipeline stays in the stable region for frame $n$.

Proof:

We show more generally that if the pipeline is in the state $S_k$ for a frame $n-1$, $0 \le k \le N$, and the input period
$T > \max_{i=0..k} T(i)$, then the pipeline is in the state $S_k$ for frame $n$.

We show by induction that $e(i, n) \le \sum_{j=0}^{i} ( C(j) + T(j) )$, $\forall\ i = 0..k$.

Step 1: for $i = 0$. We show that $e(0, n) \le C(0) + T(0)$.

We have one of the cases:

• $W(0, n) = 0$

$S(0, n) \le T(0) \Rightarrow W(0, n) + S(0, n) + C(0) \le T(0) + C(0) \Rightarrow e(0, n) \le T(0) + C(0)$

• $W(0, n) > 0 \Rightarrow e(0, n) = e(0, n-1) + T(0) - T$ (use relation 2)

$T > \max_{i=0..k} T(i) \Rightarrow T(0) - T < 0 \Rightarrow e(0, n) < e(0, n-1)$

The pipeline is in state $S_k$ for frame $n-1$ $\Rightarrow e(0, n-1) \le T(0) + C(0) \Rightarrow e(0, n) \le T(0) + C(0)$

Step 2: suppose $e(i, n) \le \sum_{j=0}^{i} ( T(j) + C(j) )$, for $i < k$. We show that
$e(i+1, n) \le \sum_{j=0}^{i+1} ( T(j) + C(j) )$.

We have one of the cases:

• $W(i+1, n) = 0$

$S(i+1, n) \le T(i+1) \Rightarrow W(i+1, n) + S(i+1, n) + C(i+1) \le T(i+1) + C(i+1)$

We know that $e(i, n) \le \sum_{j=0}^{i} ( T(j) + C(j) )$. This implies $e(i+1, n) \le \sum_{j=0}^{i+1} ( T(j) + C(j) )$.

• $W(i+1, n) > 0 \Rightarrow e(i+1, n) = e(i+1, n-1) + T(i+1) - T$ (use relation 2)

$T > \max_{i=0..k} T(i) \Rightarrow T(i+1) - T < 0$, which implies $e(i+1, n) < e(i+1, n-1)$

The pipeline is in state $S_k$ for frame $n-1$ $\Rightarrow e(i+1, n-1) \le \sum_{j=0}^{i+1} ( T(j) + C(j) )$

This implies that $e(i+1, n) \le \sum_{j=0}^{i+1} ( T(j) + C(j) )$.

□


Theorem 2 refers to the case when the pipeline is not in the stable region. It provides a solution to the case when
the pipeline latency is too high, and proves that it is enough to adjust the sensor input period in order to bring the
pipeline end-to-end latency into the stable region, where the latency has an upper bound.

Theorem 2

If the pipeline is NOT in the stable region for frame $n-1$ and, starting with frame $n$, the sensor input period
$T > \max_{i=0..N} T(i)$, then the pipeline converges into the stable region after a finite number of frames.

Proof:

Let us denote the current state of the pipeline by $I$, where $I \ne S_N$.
We show by induction that starting with frame $n$ the pipeline behaves like:

$I \xrightarrow{m_0} S_0 \xrightarrow{m_1} S_1 \xrightarrow{m_2} S_2 \rightarrow \dots \xrightarrow{m_N} S_N$

where:

$m_i$ is the number of frames needed by the pipeline in state $S_{i-1}$ (state $I$ for $i = 0$) to converge to the state
$S_i$, $0 \le i \le N$

$m_i \ge 0$, $\forall\ 0 \le i \le N$

Step 1: show that $I \rightarrow S_0$ after a finite number of frames $m_0$.
Suppose $I \ne S_0$ (otherwise we are done, with $m_0 = 0$).

We show that for each new arriving frame $x$, $e(0, x)$ decreases compared with the previous frame value, until it
becomes no greater than $T(0) + C(0)$, at which time the pipeline is in state $S_0$.

We have one of the cases:

• $W(0, x) = 0$

$e(0, x) = W(0, x) + S(0, x) + C(0) \le T(0) + C(0) \Rightarrow$ starting with this frame $x$ the pipeline is in state $S_0$.

• $W(0, x) > 0$

$e(0, x) = e(0, x-1) + T(0) - T$ (use relation 2)

$T > \max_{i=0..N} T(i) \Rightarrow T(0) - T < 0$, which implies that $e(0, x) < e(0, x-1) \Rightarrow$ the end-to-end latency
up to stage 0 decreases between frames $x-1$ and $x$.

The same process happens again over successive frames, until the pipeline gets in the state $S_0$. The number of
frames after which the pipeline gets in state $S_0$ is:

$m_0 = \left\lceil \dfrac{ e(0) - ( T(0) + C(0) ) }{ T - T(0) } \right\rceil$

where $e(0)$ is the end-to-end latency up to stage 0, when the pipeline is in state $I$.

Note: the greater the input period $T$, the smaller $m_0$, so the earlier the pipeline converges to state $S_0$.

Step 2: Suppose the pipeline is in the state $S_i$. We show that after a finite number of frames, $m_{i+1}$, the pipeline
enters state $S_{i+1}$.

Suppose the pipeline is not in $S_{i+1}$ (otherwise we are done with $m_{i+1} = 0$)

$\Rightarrow$ the end-to-end latency up to stage $i+1$ satisfies $e(i+1) > \sum_{j=0}^{i+1} ( T(j) + C(j) )$

We show that for each new arriving frame $x$, $e(i+1, x)$ decreases compared with the previous frame value, until it
becomes no greater than $\sum_{j=0}^{i+1} ( T(j) + C(j) )$, the moment by which the pipeline is in state $S_{i+1}$.

We have one of the cases:

• $W(i+1, x) = 0$

$e(i+1, x) = e(i, x) + W(i+1, x) + S(i+1, x) + C(i+1) \le e(i, x) + T(i+1) + C(i+1)$

We know that $e(i, x) \le \sum_{j=0}^{i} ( T(j) + C(j) ) \Rightarrow e(i+1, x) \le \sum_{j=0}^{i+1} ( T(j) + C(j) )$

$\Rightarrow$ starting with this frame $x$ the pipeline is in state $S_{i+1}$.

• $W(i+1, x) > 0$

$e(i+1, x) = e(i+1, x-1) + T(i+1) - T$ (use relation 2)

$T > \max_{j=0..N} T(j) \Rightarrow T(i+1) - T < 0 \Rightarrow e(i+1, x) < e(i+1, x-1)$

$\Rightarrow$ the end-to-end delay up to stage $i+1$ decreases between frames $x-1$ and $x$.

The same behavior repeats over successive frames, until the pipeline gets in the state $S_{i+1}$. The number of frames
after which the pipeline gets in state $S_{i+1}$ is:

$m_{i+1} = \left\lceil \dfrac{ e(i+1) - \sum_{j=0}^{i+1} ( T(j) + C(j) ) }{ T - T(i+1) } \right\rceil$

where $e(i+1)$ is the pipeline end-to-end latency up to stage $i+1$, at the instant the pipeline gets to state $S_i$.

Note: the greater the input period $T$, the smaller $m_{i+1}$, so the earlier the pipeline converges to state $S_{i+1}$.

□
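
The frame-count expressions in the proof are easy to evaluate numerically. The helper below is a small sketch, assuming the counts are rounded up to whole frames; the function name and argument names are hypothetical, and `bound` stands for the sum of T(j) + C(j) up to the stage of interest.

```python
import math


def frames_to_reach_state(e_start, bound, T, T_stage):
    """Frames needed for a prefix latency to fall from e_start to at most bound.

    e_start: end-to-end latency up to the stage when the adjustment begins
    bound:   sum of T(j) + C(j) for all stages up to and including this one
    T:       sensor input period, which must exceed the stage period T_stage
    """
    if T <= T_stage:
        raise ValueError("the sensor period must be greater than the stage period")
    if e_start <= bound:
        return 0   # already in the target state
    # Each frame lowers the latency by T - T_stage while the stage is backlogged.
    return math.ceil((e_start - bound) / (T - T_stage))
```

For instance, with a latency of 17 at stage 0, T(0) + C(0) = 6 and T(0) = 4, a sensor period of 6 needs `frames_to_reach_state(17, 6, 6, 4)` frames, and a sensor period of 8 needs fewer, in line with the notes in the proof.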

We have demonstrated that by increasing the sensor input period above the maximum period of all pipeline
stages, the end-to-end latency converges to the stable region. The theoretical results presented before proved the
stability of our pipeline control method.


Example 

The  next  example  illustrates  how  the  pipeline  end-to-end  latency  converges  in  time  to  the  stable  region  when  the 
input  sensor  period  is  increased  above  the  maximum  period  of  all  stages. 

Consider  the  following  instance  of  a  9  stage  pipeline: 


waiting  time 


Figure  5.  Example  of  linear  pipeline 

T(i), C(i), T and the latency are expressed in arbitrary time units. The end-to-end latency is 164, the stable region is [26, 68], and $\max_{j=0..8} T(j) = 11$.

Figure 6. Latency variation depending on T_input

In conformity with Theorem 2, if the sensor input period is greater than 11, the end-to-end latency converges to the stable region. Figure 6 shows the pipeline behavior for T = 12, 13 and 14. We can observe that the greater the sensor input period T is, the earlier the pipeline enters the stable region. According to Theorem 1, once the pipeline enters the stable region, it remains there as long as T > 11.
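
The convergence behavior in this example can be reproduced qualitatively with a small simulation. The sketch below is an illustration under simplifying assumptions, not the model used to produce Figure 6: each stage starts a frame as soon as the frame has arrived and at least one stage period has elapsed since the stage's previous start (clock alignment is ignored, so the stable region collapses to a single latency value). The three stage periods and execution times are hypothetical, chosen only so that the largest stage period is 11 as in the example, and the function names are illustrative.

```python
def simulate_latency(input_gaps, stage_periods, stage_costs):
    """Return the end-to-end latency of each frame.

    input_gaps[x] is the time between the release of frame x-1 and frame x.
    Simplified model (an assumption of this sketch): stage i starts a frame
    once it has arrived and stage_periods[i] has elapsed since the stage's
    previous start; processing then takes stage_costs[i].
    """
    prev_start = [float("-inf")] * len(stage_periods)
    latencies = []
    release = 0.0
    for gap in input_gaps:
        release += gap
        t = release
        for i, (period, cost) in enumerate(zip(stage_periods, stage_costs)):
            start = max(t, prev_start[i] + period)  # wait in the queue if needed
            prev_start[i] = start
            t = start + cost
        latencies.append(t - release)
    return latencies


def frames_to_converge(T, overload_gap=8.0, overload_frames=30):
    """Frames needed, after switching to input period T, to reach the
    queue-free latency (the sum of the stage execution times)."""
    periods = [4.0, 11.0, 6.0]  # hypothetical T(i); the maximum is 11
    costs = [2.0, 5.0, 3.0]     # hypothetical C(i)
    # Overload phase (input period below 11) followed by a recovery phase.
    gaps = [overload_gap] * overload_frames + [T] * 200
    recovery = simulate_latency(gaps, periods, costs)[overload_frames:]
    base = sum(costs)
    return next(m for m, lat in enumerate(recovery, start=1) if lat <= base)


for T in (12.0, 13.0, 14.0):
    print(T, frames_to_converge(T))
```

In this simplified model the backlog built up during the overload phase drains by T - 11 time units per frame, so a larger input period reaches the queue-free latency in fewer frames, in line with the observation that a greater T brings the pipeline into the stable region earlier.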


3.2.  Generalization  for  Clock-Based  Pipeline  With  Composite  Stages 

This section generalizes the results achieved in the previous section. It presents the model of a clock-based pipeline with composite stages and the main results. Many distributed data-flow applications have a complex structure with branches and parallel substages. One example is the ATR application depicted in Figure 9. We model these architectures as a linear pipeline with simple and composite stages. Figure 7a illustrates a simple stage and Figure 7b a composite stage.


Figure 7. A simple and a composite pipeline stage: a) simple stage, b) composite stage

A simple stage represents a single, indivisible task that processes a frame. A composite stage i consists of substages arranged in parallel branches that process parts of a frame. A substage branch works like the simple linear pipeline presented in Section 3.1. When the last substage of each branch finishes processing, the frame is reassembled at stage i+1.

We proved that the results obtained previously for the clock-based simple pipeline remain valid for this type of pipeline: setting the input period greater than the maximum period of all stages/substages guarantees that the pipeline converges to the stable region after a finite number of frames. Once it enters the stable region, the pipeline remains there as long as the sensor input period is greater than the maximum period of all stages/substages. Due to space limitations we do not present the formal proofs here.
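
The convergence condition for composite stages depends only on the largest period found anywhere in the pipeline. As a minimal sketch, assuming a hypothetical nested-list representation (a number is the period of a simple stage; a list of branches, each a list of stages, is a composite stage), the bound that the sensor input period must exceed could be computed as follows:

```python
def max_stage_period(pipeline):
    """Largest period among all stages and substages.

    pipeline is a list of stages. A simple stage is a number (its period).
    A composite stage is a list of parallel branches, and each branch is
    itself a list of stages. This representation is an assumption of the
    sketch, not a structure defined in the text.
    """
    worst = 0.0
    for stage in pipeline:
        if isinstance(stage, (int, float)):
            worst = max(worst, float(stage))
        else:
            for branch in stage:
                worst = max(worst, max_stage_period(branch))
    return worst


# Hypothetical pipeline: a simple stage, a composite stage with two
# branches, and another simple stage.
example = [4, [[3, 9], [7]], 6]
print(max_stage_period(example))
```

Any sensor input period strictly greater than the returned value satisfies the condition on stages and substages stated above.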

The next section describes how pipeline feedback adaptation works in RTARM when applied to an ATR application. It also gives some measurements and performance evaluations.


4.  RTARM  Hierarchical  Feedback  Adaptation  for  Pipeline  Applications 

The top-most HSM, which receives the admission request directly from the user client, remains in control of the application QoS and its dynamics for the application's entire lifetime. That HSM is responsible for maintaining the distributed application's QoS within the contracted region and for improving it when possible using feedback adaptation. The resource management system must react quickly and adjust the application parameters online in case of abuse of allocated resources or contract violation.

In RTARM we have designed and implemented an efficient hierarchical feedback adaptation mechanism and applied it to parallel pipeline applications and independent tasks, using the results developed in Section 3. The RTARM hierarchy consists of a pipeline HSM, a network SM and several CPU SMs acting as LSMs. The network SM does not provide feedback adaptation. The reserved network resources must cover the entire range of application rates. According to our analysis, it is possible to control the end-to-end frame latency of the entire pipeline just by controlling the rate of the input sensor or first stage. This allows the CPU SMs to conduct local feedback adaptation for each individual pipeline stage in order to provide the best local QoS within the contracted range. Thus, feedback adaptation for the entire pipeline and for the individual CPU stages is conducted independently.

CPU  Service  Manager  Feedback  Adaptation 

CPU SMs run pipeline stages just like any regular periodic independent task. In fact, CPU SMs are unaware that these tasks are part of a higher-level entity, and they perform all RTARM functions in the same way. As mentioned in Section 2.4, the CPU SM QoS (Figure 8) consists of rate and iteration workload (execution time), both specified as intervals [min, max]. The CPU SM can directly control the application rate, but cannot change the application workload.

Figure 8. CPU SM QoS (plot not reproduced; axis label: Workload)

The CPU SM uses the product CPU utilization = Rate × Workload to assess schedulability. Applications send their actual QoS as events to the CPU SM monitor at the end of each periodic iteration. The application is allocated a constant fraction L of the total processor time. At any time the current operational point (COP) may vary, as long as R × W < L. The CPU SM adjusts the current operational point:

•  increase  rate  when  workload  decreases 

•  decrease  rate  on  overload 
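
The two adjustment rules can be sketched as a single rate update. This is a minimal illustration, not the RTARM implementation: the function name, the `slack` threshold used to detect underutilization, and the policy of moving the rate directly to L / W are assumptions. The text only states that the rate is decreased on overload, increased when the workload decreases, and that the rate is specified as a [min, max] interval.

```python
def adapt_rate(rate, workload, load_limit, rate_min, rate_max, slack=0.8):
    """Return the new rate for one task after an iteration report.

    rate       -- current rate (iterations per second)
    workload   -- measured execution time of the last iteration (seconds, > 0)
    load_limit -- allocated fraction L of the processor time
    slack      -- hypothetical threshold: a load below slack * L counts as
                  underutilization (an assumption of this sketch)
    """
    load = rate * workload          # CPU utilization = Rate x Workload
    target = load_limit / workload  # rate that uses exactly the allocation
    if load > load_limit:
        new_rate = target           # overload: decrease the rate
    elif load < slack * load_limit:
        new_rate = target           # underutilization: increase the rate
    else:
        new_rate = rate             # operating point is acceptable
    # The rate must stay inside the contracted interval [min, max].
    return min(rate_max, max(rate_min, new_rate))


# Hypothetical values: L = 0.4, contracted rate interval [0.2, 1.0] Hz.
print(adapt_rate(rate=1.0, workload=0.5, load_limit=0.4, rate_min=0.2, rate_max=1.0))
```

Because the CPU SM cannot change the workload, the rate is the only quantity the sketch modifies; the workload is treated purely as a measured input.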


Pipeline Feedback Adaptation

The pipeline QoS parameter we consider critical and want to control is the end-to-end latency. As the pipeline evolves in time, the rates of intermediate stages may change as a result of CPU SM feedback adaptation. In normal circumstances the input sensor period is maintained at a value greater than the period of any stage/substage of the parallel pipeline application, but it can fall below a stage period because of independent CPU feedback adaptation. When the accumulation of frames in the queues between stages increases the end-to-end latency beyond a maximum threshold, the PSM sets the input sensor period to the maximum value from the pipeline contract. A finite state machine in the PSM maintains this maximal period for a fixed time, allowing the queues to empty. Then the PSM sets the input sensor period back to the maximal period of all stages. In this way, we know that the end-to-end latency decreases and that after a finite number of frames the pipeline enters the stable region (Section 3).
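
The finite state machine described above can be sketched as follows. This is an illustrative reconstruction under assumptions: the state names, the class and method names, and the hold time are hypothetical, and the controller simply returns the new sensor input period instead of sending a command to the CPU SM that controls the sensor.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NORMAL = 1    # latency below the threshold, no corrective action
    DRAINING = 2  # sensor held at the contract maximum period


@dataclass
class PipelineController:
    latency_threshold: float    # maximum acceptable end-to-end latency
    contract_max_period: float  # largest input period allowed by the contract
    hold_time: float            # hypothetical fixed draining time
    state: State = State.NORMAL
    drain_started: float = 0.0

    def on_frame(self, now, latency, max_stage_period):
        """Process one latency report; return a new sensor input period,
        or None when the period should stay unchanged."""
        if self.state is State.NORMAL:
            if latency > self.latency_threshold:
                # Queues have built up: slow the sensor as much as allowed.
                self.state = State.DRAINING
                self.drain_started = now
                return self.contract_max_period
            return None
        if now - self.drain_started >= self.hold_time:
            # Queues had time to empty: return to the maximal stage period,
            # as described in the text.
            self.state = State.NORMAL
            return max_stage_period
        return None


# Hypothetical run: threshold 13 s, contract maximum period 5 s, hold 20 s.
psm = PipelineController(latency_threshold=13.0, contract_max_period=5.0, hold_time=20.0)
for now, latency in [(0.0, 10.0), (1.0, 30.0), (10.0, 25.0), (21.0, 8.0)]:
    print(now, psm.on_frame(now, latency, max_stage_period=1.5))
```

The controller needs only the two inputs named in the text, the end-to-end latency of the current frame and the maximal period of all stages, which is why no per-stage state appears in the sketch.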

This  method  is  simple  and  efficient,  as  the  only  parameter  to  be  adjusted  is  the  sensor  input  period,  while  the 
pipeline  stages  are  controlled  only  by  the  corresponding  CPU  SM.  This  mechanism  avoids  costly  communication 
and  coordination  between  the  HSM  and  all  the  CPU  SMs.  The  information  required  for  pipeline  feedback 
adaptation  is  minimal:  the  end-to-end  latency  for  the  current  frame  and  the  maximal  period  of  all  stages. 

Another option for pipeline feedback adaptation would have been to let the PSM directly adjust the rate of each stage online. In this case the PSM would have to keep track of the current workload and rate, and possibly the queue lengths, for all stages, implying extra communication, processing overhead and lower resource utilization for the CPU service managers.


4.1.  The  Automatic  Target  Recognition  Experiment 

We tested the RTARM system and the feedback adaptation mechanism on a true mission-critical application.

The ATR application, schematically shown in Figure 9, processes video frames captured by a camera and displays recognized targets on a display. Stage 0 (the sensor) generates frames that are passed through a series of filters and processing elements up to stage 6, which displays the original image and the identified targets. The frames are 8-bit monochrome images of 360x360 pixels and contain a variable number of targets (from 3 to 50), depending on the frame. Stages 4, 5 and 6 exhibit a variable workload, proportional to the number of targets, which without feedback adaptation would generate queue accumulations with a negative effect on the end-to-end frame latency.

Figure 9: ATR pipeline application and QoS (diagram not reproduced; labels: End-to-End Latency, Frame Arrival Period, time)

4.2.  Performance  Metrics  and  Evaluation 

The  runtime  environment  for  the  ATR  experiment  consists  of  three  450MHz  NT  Dell  Workstation  400 
machines,  connected  via  a  Fore  ATM  switch  with  OC-3c  (155Mbps)  links.  Each  machine  hosts  a  CPU  SM.  Both 
the  network  SM  and  the  pipeline  SM  run  on  one  of  those  three  machines,  and  we  consider  their  own  CPU  resource 
consumption  negligible.  All  inter-SM  CORBA  communication  uses  a  secondary  Fast  Ethernet  network,  so  the 
ATM lines remain 100% available. We used the NT performance counter for precise measurements.

The ATR pipeline contract requires an acceptable output frame period interval of [1, 5] s and a frame latency of 0.7-13 s. The seven ATR stages run at a variable workload between 0.02 and 1.5 s and within the same period interval [1, 5] s.

We first present timing measurements for the feedback adaptation at the CPU SM and PSM level (Figure 10). We measured the processing overhead of the feedback adaptation code (part 2 in Figure 10) and the time it takes the SM to react, from the moment it receives the current QoS from the application until its adaptation command is enforced (part 2 + part 3).


Figure 10. Feedback adaptation performance measurements.

The measured times are displayed in Table 1. For the CPU feedback adaptation, detecting and enforcing the QoS adaptation takes around 4.4 ms. Most of this time, 3.9 ms, is spent in a set_qos() operation, a normal two-way CORBA call. The pipeline adaptation enforcement includes a set_qos() call to the CPU SM that controls the sensor (or first stage), which in turn calls the application directly with a set_qos() call. This explains why enacting pipeline QoS adaptation takes almost twice as long as enacting CPU SM QoS adaptation.


| | Detection and decision processing (2) | Decision enactment (3) | Total time (2+3) |
|---|---|---|---|
| CPU SM | 0.508 ms | 3.914 ms | 4.422 ms |
| Pipeline SM | 0.859 ms | 6.816 ms | 7.675 ms |

Table 1. Feedback adaptation performance results for CPU SM and PSM

Figure 11 displays CPU feedback adaptation for stage 4 in the ATR pipeline. The stage has a period and a variable workload, which causes the CPU SM to change the rate. Points A indicate overloads that trigger a rate decrease, and points B indicate chronic underutilization that triggers a rate increase.


Figure 11. CPU SM feedback adaptation for a task with variable workload. (Plot not reproduced; title: CPU Feedback Adaptation; x-axis: experiment time in seconds, 100 to 200; series: Workload, CPU Load = Rate × Workload.)

While running the ATR application (Figure 12), the pipeline feedback adaptation mechanism makes sure the end-to-end latency and rate stay in the contracted range. In order to demonstrate its effectiveness in practice, we disabled the pipeline feedback adaptation after some time while keeping the sensor input period at a sustained low value of 1.48 s (0.67 Hz). This caused an accumulation of frames in the stage queues that translated into an increasing end-to-end frame latency. While feedback adaptation was disabled we did not obtain latency measurements, so we drew a dotted line between points A and B. When the latency reached 30 s, well above the contracted value, we re-enabled pipeline feedback adaptation. The PSM immediately increased the sensor input period to 5 s. The latency went down rapidly (B -> C), below the threshold, after a brief spike caused by the inertia of the more than 23 frames already in transit through the pipeline.


Figure 12: Latency Variation for ATR with and without pipeline feedback adaptation (plot not reproduced; series: Sensor Input Period, End-to-end Latency, Threshold)

Our hierarchical feedback adaptation algorithm proved to be effective and efficient. Detection, decision and enforcement take less than 8 ms and involve only the CPU SMs for the sensor stage and the last stage that actually reports the latency and rate.


5.  Conclusion 

This paper briefly presented the Real-Time Adaptive Resource Management system, its architecture and flexibility. We developed a feedback adaptation mechanism for distributed data-flow applications based on an analytical model. We proved its correctness and stability and demonstrated its effectiveness by running an Automatic Target Recognition parallel pipeline application on a network of workstations managed by the RTARM system. Our pipeline control method uses minimal information about the current state of the pipeline application and requires only one action to correct the end-to-end frame latency.

A direction for future work is to add prevention features to the current feedback adaptation method. Currently, it only takes corrective actions when the QoS falls below a threshold. Preventive actions would further decrease the overall pipeline reaction time. We also plan to study feedback adaptation for parallel pipeline applications where several pipeline HSMs have exclusive control over parts (sub-pipelines) of the entire distributed application.